Open-source, high-throughput targeted in situ transcriptomics for developmental and tissue biology

ABSTRACT Multiplexed spatial profiling of mRNAs has recently gained traction as a tool to explore the cellular diversity and the architecture of tissues. We propose a sensitive, open-source, simple and flexible method for the generation of in situ expression maps of hundreds of genes. We use direct ligation of padlock probes on mRNAs, coupled with rolling circle amplification and hybridization-based in situ combinatorial barcoding, to achieve high detection efficiency, high-throughput and large multiplexing. We validate the method across a number of species and show its use in combination with orthogonal methods such as antibody staining, highlighting its potential value for developmental and tissue biology studies. Finally, we provide an end-to-end computational workflow that covers the steps of probe design, image processing, data extraction, cell segmentation, clustering and annotation of cell types. By enabling easier access to high-throughput spatially resolved transcriptomics, we hope to encourage a diversity of applications and the exploration of a wide range of biological questions.


4) To encourage the scientific community to use RNA-ISS in a diversity of applications, it will be useful to include the cost breakdown of the method (e.g., per sample or per slide) and a comparison with the previous versions HybISS and HybRISS.

5) Finally, I would recommend using a title with a broader scope as this method can be useful beyond developmental biology.

Advance summary and potential significance to field
The authors describe an improvement to their in situ sequencing method to detect RNA transcripts in a variety of organs and organisms. First, the authors introduce the use of the T4RNALigase2 to ligate the two ends of a padlock probe for rolling circle amplification. Previous iterations of the method could only ligate the ends on a DNA-DNA duplex, which first required the reverse transcription of the RNA. Reverse transcription has low efficiency, and by circumventing this step the sensitivity of the method is increased. Next, they apply the method to mouse, rat, chicken and fly tissue to demonstrate its versatility. Lastly, the authors demonstrate that the same probe set can be used on different but related species to detect the same gene.
In addition to the introduction of the ligase, the authors put considerable effort into sharing their method. In my opinion this is an important and timely aspect of the manuscript, because the whole field of spatially resolved transcriptomics is rapidly commercializing. Even though this trend is giving more researchers access to these technologies, the practical knowledge and know-how in academia is being lost to these companies. With this manuscript and the online resources (protocol, guide, notebooks, code), the authors enable others to build in-house versions of the technology and flexibly adapt it to their use.

Comments for the author
The authors present an improvement of the popular ISS protocol and make an admirable effort to share such a complex method with protocols, instructions and code. However, the manuscript would benefit from a more thorough evaluation and comparison to other methods, to fully evaluate the advance made. I would therefore suggest the following revisions:
In my opinion the introduction of T4RNALigase2 to the ISS protocol to increase the sensitivity is the major advance of this manuscript. This builds upon the previous work of the Nilsson group (Krzywkowski et al. 2019 RNA) and follows their recent benchmark of the CARTANA kit for ISS, which also uses ligation of the padlock probe directly on the RNA (Lee et al. 2022 Scientific Reports). Now that the CARTANA kit is no longer commercially available and the whole method is commercialized by 10X Genomics, this work provides an important resource to enable home-built ISS without commercial dependence. However, the manuscript would benefit from a more thorough validation of the protocol and detection sensitivity. Specifically:
1) The conclusion of the manuscript that RNA-ISS is better than cDNA-ISS is predominantly based on Supplementary table 1, where 3 genes in two mouse brain regions are compared. First, it would be helpful to report the standard deviation of the fold increase. Second, it is unclear why these genes are chosen and why only the thalamus and ventricle are considered in the comparison. It would be better to perform the analysis of the whole mouse brain section if possible. Furthermore, I would strongly suggest that plots of the raw data are included in the manuscript, ideally showing the whole slide and zoomed-in signal dots with nuclei segmentation outlines. In addition, as average numbers of transcripts per nucleus are reported, it would be good to include the number of cells in each condition. If feasible, more genes should be included to get a better estimate of the fold increase.
2) The methods indicate that tissue sections with a thickness between 10 and 20 micrometers are used for this protocol. Are the cDNA-ISS and RNA-ISS conditions performed on the same thickness? Even though transcripts per nucleus are used in the calculation, section thickness could influence the counts, especially because the images are flattened with a maximum projection in Z. It would be helpful if the authors could report the number of nuclei in each section as a proxy for the slice thickness.

3) Please include a protocol for the cDNA-ISS data used, or a literature reference if the dataset was already published. Also include a description of how the comparison is made in the methods.

4) To further assess sensitivity between cDNA-ISS and RNA-ISS, an analysis similar to Lee et al. 2022 Scientific Reports, where the distributions of counts are directly compared (Fig 1b-e), would strengthen the manuscript. Furthermore, since many methods in this field assess their sensitivity by comparing to smFISH-based data, this analysis would help place the new method in a common reference frame. Lastly, a comparison to 10X Xenium would also be interesting if possible.
The introduction, Figure 1D and online code also cover cell segmentation and clustering of ISS data. However, no demonstrations of these steps are given in the manuscript. An addition of this, for instance on the chicken data, would demonstrate the full pipeline. Because these algorithms wrap existing tools that are independently published (PCIseq, Cellpose and Scanpy), a thorough evaluation is not needed in my opinion, but an example would complete the claims made in the introduction.
Combining ISS with protein detection is a very promising capability which cannot easily be obtained with other spatially resolved transcriptomics techniques. However, in previous work the Nilsson group already demonstrated that this was possible (Mbp protein in Lee et al. 2022 Scientific Reports). Here this is again demonstrated with the GFAP protein. A stronger case would be made if multiple proteins were detected, because it is likely that not all epitopes are equally well preserved during the ISS protocol. However, this is a major effort and therefore toning down the claims would also suffice.
The authors suggest that next to EdU birth dating, other fluorescent reporters can be used in combination with ISS. However, if genetically encoded fluorescent proteins are used, will their fluorescence not interfere with the ISS signal? If they first need to be degraded, would this affect the ability to use immunohistochemistry after barcode imaging?
The manuscript should better clarify to the reader that the basis for this protocol is HybISS (Gyllborg et al. NAR) rather than the original ISS protocol by Ke et al. 2013 Nature Methods.
The cross-reactive design of probes between species is an interesting idea and can save resources in specific experimental situations where closely related species are studied. However, it also suggests that RNA-ISS could suffer from more false positives. Could the authors elaborate on this and include a measurement of false positives?
For the simulation of the Macaque probes the authors note that a human probe panel has 74% matches, 12% no match and 14% off-target matches to the Macaque transcriptome. However, a discussion and conclusion on whether that would be enough to yield high-quality data are lacking.
Figure 2 A-E, why is the background so different between the images?
For none of the presented experiments is the full dataset, with all detected genes, shown simultaneously in a plot. Adding this would be valuable to show the alignment performance of the algorithm and give a better impression of the capabilities of the presented protocol and analysis code.

Minor comments:
The keyword "Multi-omics" is a stretch because only one protein is detected using immunohistochemistry alongside the spatial transcriptome.

Introduction:
Clarify that "sequencing-based" methods use Next Generation Sequencing, to prevent confusion with ISS.
The authors describe the rolling circle amplification products to be "large", "very bright" and "massively amplified". It would be more informative to specify the (approximate) size and the amount of amplification. On a similar note, please specify what size range a "large tissue section" is.
The authors list the detection of heavily fixed mRNAs as one of the limitations of ISS, and in the next paragraph summarize that the manuscript overcomes these limitations. However, no proof (or reference to previous work) is given that indicates that over-fixing is an issue. Nor is it demonstrated that the new protocol overcomes this specific issue.

Results and discussion:
For the proof-of-principle experiments in Drosophila, mouse and chicken it would be helpful to see the images of the in situ reference databases side-by-side with the ISS data. A reference to the databases should also be included in the main text (not only in the figure legend).
Include additional information on the homology between the engrailed gene in the different species. It would also be helpful to add percentages of nucleotides that differ inside the probed region between the species. Figure 3D-G, please add a label that this is rat brain. Possibly best to do this for all species in all figures.

Material and methods:
The sequences of all probes should be available in a supplemental file. Manufacturers of all chemicals and reagents should be included.
To which samples was TrueBlack applied?
Which objective was used to image the RCPs? How many Z-stacks were imaged? In the filtering of spots a quality score of 0.5 is used. How is this value determined, and is there any ground truth data that is used to establish this value?

ISS manual:
In the filename "TileScan 0--Stage01--C03.tif" the index after "Stage" suggests that a maximum of 100 FOVs can be included (Stage00 to Stage99). Is the algorithm limited to 100 FOVs? Please give an indication of how large a large dataset is. This will help users prepare the hardware.
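To make the reviewer's question concrete, the following is a minimal sketch of how a tile filename of this shape could be parsed without assuming a fixed number of digits. The regular expression and the function name `parse_tile_name` are illustrative assumptions, not the authors' code; whether the microscope software exports wider indices, and whether the published pipeline accepts them, is not stated in this correspondence.

```python
import re

# Hypothetical pattern inferred from the single example filename quoted above.
# \d+ accepts any number of digits, so Stage100 and beyond would also parse.
TILE_PATTERN = re.compile(r"--Stage(?P<stage>\d+)--C(?P<channel>\d+)\.tif$")


def parse_tile_name(name):
    """Return (stage_index, channel_index) extracted from a tile filename."""
    match = TILE_PATTERN.search(name)
    if match is None:
        raise ValueError(f"Unrecognised tile filename: {name!r}")
    return int(match["stage"]), int(match["channel"])


print(parse_tile_name("TileScan 0--Stage01--C03.tif"))
print(parse_tile_name("TileScan 0--Stage123--C00.tif"))
```

A parser written this way would not impose a 100-FOV ceiling by itself; any real limit would come from the acquisition software or from downstream code.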

Code:
The authors could consider putting a minimal raw dataset online for first-time users to try.

Author response to reviewers' comments
We thank the reviewers for their constructive criticism and positive feedback. We believe their comments were very fair, and we think that the suggested revisions significantly increased the quality of our work.
To facilitate the reviewers' work, and post-peer review re-assessment of our manuscript, we deposited all the data from all the datasets mentioned in the text on the permanent viewer at: https://lee2024supp.serve.scilifelab.se/ We added this information in the Data availability section. If the reviewers and editors agree, our proposal is to cite this link in the paper as the container for all ISS supplementary data. We think this is the most intuitive way to browse all the datasets in an interactive manner, as an alternative to many pages of supplementary data showing individual gene expression patterns.
However, we also understand that permanent accessibility to these datasets is crucial for reproducibility, and we wish to reassure both the reviewers and the editors about our long-term commitment: the ScilifeLab-serve hosting infrastructure will allow users to get a DOI for the published datasets by the end of the year, and we will link this information to the publication whenever it becomes available to us.
We hope this way of providing the supporting data is sufficient, but we're very open to different arrangements if the reviewers or editors consider them more appropriate (i.e. supplementary figures and/or sharing the full datasets on Figshare). In the meantime, we hope these viewers at least facilitate the reviewers' task.
Below is a point-by-point response to the comments.
Both reviewers suggest a more detailed comparison of RNA-ISS against the previous cDNA-based protocol. We agree this needs to be presented as a main figure in the text and supported by more extensive data.
Reviewer 1:
1) Efficiency of RNA-ISS: The authors present RNA-ISS, which uses chimeric padlocks and T4RNALigase2 to obtain a ~2-fold increase in sensitivity in ISS. This seems to be one of the major contributions of the article considering their previously described HybISS approach based on cDNA (Gyllborg et al NAR 2020). However, the evidence for this key result needs to be more extensive and part of the main text/figures of the article. The Supplementary table 1 does not suffice to convince the readers of the results. I suggest including more genes (~10) in the comparison with different levels of gene expression and some images of cDNA vs dsRNA ISS. Moreover, the authors assert potential advantages of the method, like 1) a higher number of transcripts per cell, 2) genes with lower expression levels can be more efficiently detected, 3) informative signal density can likely be extracted with a reduced set of probes, which might be useful for particular applications (detection of short mRNAs, isoform discrimination, etc.). Points 1 and 2 can be addressed with the suggestion above. It will be interesting if the authors show the performance of RNA-ISS vs cDNA in short mRNAs or isoform discrimination. Finally, since ISS is a single cell resolution method, a zoom-in of the large images of the tissues will be helpful when comparing cDNA vs RNA-ISS.
As suggested by both reviewers, we increased the number of analyzed genes (9), quantified the overall sensitivity increase on a larger tissue area, and included a plot of the number of reads per cell for all the genes across the 2 methods. From this experiment, our conclusions are:
• The average sensitivity increase of RNA-ISS vs cDNA-ISS, keeping fixed the number of probes across technologies, is in the order of 2x.
• This observed sensitivity increase is variable depending on the probed gene. While in some cases the sensitivity increase is marginal, the efficiency gain can be much higher for other genes. It's still unclear to us where this variability comes from, but possibly further work on refining the probe design criteria will be needed to stabilize this variation.
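As an illustration of the kind of summary both reviewers asked for, the sketch below computes a per-gene fold increase and its mean and sample standard deviation across genes. The gene names and per-cell counts are hypothetical placeholders rather than values from the study, and `fold_increase_summary` is an illustrative name, not part of the authors' pipeline.

```python
from statistics import mean, stdev


def fold_increase_summary(rna_counts, cdna_counts):
    """Per-gene RNA-ISS / cDNA-ISS ratio, plus mean and sample SD across genes.

    Both arguments map gene name -> average transcripts per cell.
    Genes missing from either method, or with zero cDNA-ISS signal, are skipped.
    """
    folds = {
        gene: rna_counts[gene] / cdna_counts[gene]
        for gene in rna_counts
        if cdna_counts.get(gene, 0) > 0
    }
    values = list(folds.values())
    return folds, mean(values), stdev(values)


# Hypothetical transcripts-per-cell values, for illustration only.
rna = {"gene1": 4.0, "gene2": 3.0, "gene3": 9.0}
cdna = {"gene1": 2.0, "gene2": 3.0, "gene3": 3.0}

folds, avg, sd = fold_increase_summary(rna, cdna)
print(folds)
print(f"mean fold increase = {avg:.2f}, SD = {sd:.2f}")
```

Reporting the spread alongside the mean makes the gene-to-gene variability described in the second conclusion visible at a glance.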
We also added new figure panels with zoomed insets, so that readers can better appreciate the increased sensitivity at single cell resolution.
Although the idea of testing our new chemistry and comparing it with the previous cDNA iteration on the specific tasks of small RNA detection or isoform discrimination is appealing, we feel this goes beyond the scope of this work. Successful detection will vary case by case and will depend on an interplay of factors (chemistry efficiency, number of probes, expression levels). We opted to remove the claim from the main text.
2) Supplementary information: The authors didn't include information regarding chimeric padlock sequences, L-probes, detection oligos, codebooks, etc. Also, the authors showed the gene expression patterns for 4 genes (out of 15) in the mouse coronal sections, and 4 genes (out of 35) for chicken optic tectum. I understand the data is shown in the accompanying paper (Eneritz Rueda-Alaña, et al. BirthSeq, a new method to isolate and analyze dated cells from any tissue in vertebrates). However, the sequences and rest of the images need to be included as supplementary data. Finally, a side-by-side comparison with an atlas (ABA, Geisha Arizona database) will help to understand the specificity of the gene patterns.
We apologize for the oversight. The supplementary data about probes, detection oligos, codebooks etc. is now added as supplementary information to the paper.
We also produced a supplementary figure showing the side-by-side comparison between ISS-detected genes and a summary scheme of previously described expression patterns in Drosophila.
Regarding the images of the mouse and rat brains, and the chicken optic data for all the genes not directly represented in the respective figures, we deposited the data into our permanent interactive TissUUmaps viewer at the link indicated above.
The authors applied RNA-ISS in a single fluorescent detection oligo manner to 'disentangle potential artifacts of the chemistry from issues that might arise from the downstream computational decoding'. A fair comparison will be to quantify the number of RNA molecules detected using a single readout detection vs combinatorial barcoding (which needs computational decoding), and measure their correlation, either in Drosophila, mouse, or chicken.
Reviewer 1 touches here on a very important point: how much data do we lose due to computational decoding? How does this data loss relate to the experimental settings? More crucially, can this data loss be mitigated using experimental, imaging, and data processing strategies?
On one hand, ISS is designed to be a highly multiplexed method, so computational decoding is a crucial aspect of it.On the other hand, our decoding strategy requires the detection of individual discrete signal spots, so we need imaging conditions that allow that spot detection to happen successfully.
In other words: our ability to successfully decode depends on how optically dense the images are.
The implication is that our data loss caused by computational decoding is largely context-dependent, and affected by several variables: the relative density of a specific mRNA, its relative expression compared to the surrounding mRNAs, how many genes are co-decoded in the same experiment, the chosen decoding strategy, etc. A direct comparison between computational and non-computational expression analysis might therefore be only partially informative, as well as potentially misleading in some cases. We believe a more informative comparison is instead to show how computational decoding is affected by data density, under different imaging and data processing conditions. To get our point across, we'd like to share the following analysis with the reviewers.
We designed a benchmarking experiment as follows:
1) We imaged the same exact tissue area first using a 20x objective, then with a 40x objective, and performed a parallel decoding of the 2 datasets.
2) We then deconvolved both raw image sets, and repeated the decoding on the deconvolved images.
3) Finally, we applied image restoration using CARE to the projected raw images and repeated the decoding once again. This was to test the performance of CARE denoising against real deconvolution.
The hypothesis is that, if computational decoding is negatively affected by optical crowding, 20x should perform worse than 40x, and raw should perform worse than deconvolved, both in terms of absolute read count and quality metrics.The top performance is expected at high-magnification imaging followed by deconvolution.
For all the conditions we extracted 2 values, represented in the figure below.The first is the number of raw reads (light-transparent bars), while the second is the number of reads passing our internal quality threshold (solid bars), each column representing an experimental condition.
These results strongly suggest that optical crowding indeed leads to suboptimal spot detection and to a lower count of high-quality spots. Prospective users will need to critically evaluate the imaging and decoding strategy and tailor it to the specific parameters of their particular experiments. We have now included this analysis and its conclusions in the Supplementary Manual, under the paragraph 'Technical notes and considerations for successful ISS experiments'.
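The two values extracted per condition (raw reads and reads passing the quality threshold) can be tallied as in the following minimal sketch. The threshold of 0.5 is the value quoted in the reviewer's comments; the use of `>=` at the cut-off, the condition labels, the toy quality scores and the function name `summarize_reads` are assumptions made for illustration and do not reproduce the authors' decoding code or results.

```python
from collections import defaultdict

# Quality cut-off quoted in the review; how it was chosen is not stated here.
QUALITY_THRESHOLD = 0.5


def summarize_reads(reads, threshold=QUALITY_THRESHOLD):
    """Return {condition: (raw_count, passing_count)}.

    `reads` is an iterable of (condition, quality_score) pairs, one per decoded spot.
    A read passes when its score is >= threshold (an assumed convention).
    """
    summary = defaultdict(lambda: [0, 0])
    for condition, quality in reads:
        summary[condition][0] += 1
        if quality >= threshold:
            summary[condition][1] += 1
    return {condition: tuple(counts) for condition, counts in summary.items()}


# Hypothetical toy reads, not data from the study.
toy_reads = [
    ("20x_raw", 0.35), ("20x_raw", 0.62), ("20x_raw", 0.48),
    ("40x_deconvolved", 0.81), ("40x_deconvolved", 0.57), ("40x_deconvolved", 0.44),
]

for condition, (raw, passing) in summarize_reads(toy_reads).items():
    print(f"{condition}: {raw} raw reads, {passing} passing ({passing / raw:.0%})")
```

Comparing both numbers per condition separates the effect of imaging on spot detection (raw reads) from its effect on decoding confidence (passing reads).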
2) The authors show cross-reactive padlock probes in closely related species. It will be helpful to show as supplementary information the sequence alignment between different species and the region where the padlocks bind.
We added this information in the supplementary table.
Even more useful will be to include in the Python pipeline the percentage of homology of the designed padlocks with the closely related species. With this, in a single run of the padlock design pipeline, it will be possible to select the probes that will work (or not) in closely related species.
We acknowledge that this would be a very useful function. However, designing cross-reactive probes is not only a matter of specifying a homology percentage, but also of making sure the probes are sufficiently specific across species. One risk, in case of taxon-specific gene duplications, is that a probe designed against one gene in species A might recognise multiple paralogues in species B. Including these cross-checks for multiple species in a single run of the pipeline requires adding some specific functionalities. We are however working to set up a Dash-based web app to simplify the probe design for novel users, and we'll include this functionality in it by specifying a custom workflow.
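The two checks discussed here (identity to the intended orthologue, and hits on other genes such as paralogues) can be illustrated with the simplified sketch below. It is not the authors' pipeline, which is described elsewhere in this response as relying on cutadapt pattern matching and BLAST; it uses a plain Hamming-distance scan on one strand only, without indels or reverse complements, and all sequences, gene names and the `max_mismatches` default are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_hits(target, transcript, max_mismatches):
    """Return (position, mismatches) for each window within max_mismatches of target."""
    n = len(target)
    hits = []
    for i in range(len(transcript) - n + 1):
        mismatches = hamming(target, transcript[i:i + n])
        if mismatches <= max_mismatches:
            hits.append((i, mismatches))
    return hits


def cross_reactivity(target, transcriptome, intended_gene, max_mismatches=2):
    """Report identity to the intended gene and any other genes that are also hit."""
    hit_genes = {}
    for gene, sequence in transcriptome.items():
        hits = find_hits(target, sequence, max_mismatches)
        if hits:
            hit_genes[gene] = hits
    off_targets = sorted(gene for gene in hit_genes if gene != intended_gene)
    identity = None
    if intended_gene in hit_genes:
        best = min(mismatches for _, mismatches in hit_genes[intended_gene])
        identity = 100.0 * (len(target) - best) / len(target)
    return {"identity_percent": identity, "off_targets": off_targets}


# Hypothetical 10-nt target from species A and a toy species-B transcriptome.
probe_target = "ACGTACGTAC"
species_b = {
    "geneX": "TTTACGTACGAACTTT",          # orthologue, one mismatch
    "geneX_paralogue": "CCACGTACGTACCC",  # duplicated gene, perfect match
    "geneY": "GGGGGGGGGGGGGGGG",          # unrelated
}
print(cross_reactivity(probe_target, species_b, intended_gene="geneX"))
```

In this toy case the probe would be 90% identical to its intended target yet also match a paralogue perfectly, which is the situation the authors describe as a risk of selecting probes on homology percentage alone.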
Regarding the mouse/rat data: some genes (e.g. Cck) seem to be highly expressed in rats compared to mice. To know if this difference is due to the species rather than the specificity of the padlock probes, please add sequencing (RNA-seq, scRNA-seq) or gene atlas data if available.
This result was also very puzzling to us, especially considering that the probes were designed to have perfect matches on mouse and to allow a few mismatches on rat. More intriguingly, we often have cases where we could design 5 probes for mouse, but only a subset of those are predicted to cross-react in rat. If anything, one would have expected the probes to be more sensitive in the former than in the latter. Unless, of course, the increased signal in rat is produced by spurious binding to unwanted targets, as hinted by Reviewer 1. Upon checking once again the predicted specificity of the probes via multiple means (using the pattern matching function of cutadapt, as in our probe design pipeline, or via BLAST), we have confirmed that these probes should not have detectable off-targets in rat, at least using the currently available transcriptomic information. We acknowledge, however, that the quality of transcriptomic annotation might be lower for rat than for mouse. We couldn't find informative datasets to confirm our finding, and we are not sure having them would have resolved our (and the reviewer's) doubts. We speculate that the difference in detection abundance might arise from one or more of several factors:
• The first factor is the number of cells. A rat brain is considerably larger than a mouse one, so the total count of detections is expected to be higher for that reason, assuming equal expression at the single cell level. To some extent this seems to be one of the sources of variation.
• However, zooming in on individual cells in the 2 datasets clearly shows that the rat data looks denser than the mouse data also at the single cell level (a very clear example is Cck), so the number of cells doesn't seem to be the only reason behind this difference.
• The fact that the expression patterns look very specific would suggest that the probes are working as they should. Specificity-wise, a possibility we can't account for is that these probes recognise non-annotated transcripts that were not captured by our specificity check, or by our more recent second check via BLAST. If this is happening, we have no way to know, acknowledging that working with incompletely annotated transcriptomes is tricky and might produce misleading results.
• We also speculate that, besides real differences in the gene expression levels across the 2 species, a potential cause of this difference could also be the fact that these samples might have had a different RNA quality to begin with.
We incorporated these discussion points in the section 'Technical notes and considerations for successful ISS experiments' of the Supplementary Manual.
3) Either in 'RNA-ISS Library prep' or protocols.io, please provide the company and catalogue number for all the reagents used.
We apologize for the lack of clarity in our initial submission. The company, product name and catalog number of reagents are now documented on protocols.io as well as in the supplementary table.

4) To encourage the scientific community to use RNA-ISS in a diversity of applications, it will be useful to include the cost breakdown of the method (e.g., per sample or per slide) and a comparison with the previous versions HybISS and HybRISS.
This is a very fair point, and very useful information for prospective users. The cost breakdown per 100 µl reaction for library preparation is now documented in the supplementary table.
5) Finally, I would recommend using a title with a broader scope as this method can be useful beyond developmental biology.
We thank the reviewer for the suggestion. The new proposed title is: "Open-source, high-throughput targeted in-situ transcriptomics for developmental and tissue biology"
Reviewer 2:
1) The conclusion of the manuscript that RNA-ISS is better than cDNA-ISS is predominantly based on Supplementary table 1, where 3 genes in two mouse brain regions are compared. First, it would be helpful to report the standard deviation of the fold increase. Second, it is unclear why these genes are chosen and why only the thalamus and ventricle are considered in the comparison. It would be better to perform the analysis of the whole mouse brain section if possible. Furthermore, I would strongly suggest that plots of the raw data are included in the manuscript, ideally showing the whole slide and zoomed-in signal dots with nuclei segmentation outlines. In addition, as average numbers of transcripts per nucleus are reported, it would be good to include the number of cells in each condition. If feasible, more genes should be included to get a better estimate of the fold increase.
We agree with Reviewer 2, and this is something that was also pointed out by Reviewer 1. Please see point 1) of our response to Reviewer 1 as well as the updated figure.
2) The methods indicate that tissue sections with a thickness between 10 and 20 micrometers are used for this protocol. Are the cDNA-ISS and RNA-ISS conditions performed on the same thickness? Even though transcripts per nucleus are used in the calculation, section thickness could influence the counts, especially because the images are flattened with a maximum projection in Z. It would be helpful if the authors could report the number of nuclei in each section as a proxy for the slice thickness.
We thank the reviewer for pointing this out. The experiments to compare RNA-ISS vs cDNA-ISS are performed on consecutive sections taken at the same thickness (10 µm). We clarified this in the main text.
3) Please include a protocol for the cDNA-ISS data used, or a literature reference if the dataset was already published. Also include a description of how the comparison is made in the methods.
We agree this part of the text needed clarification, and thank the reviewer for pointing this out.
The protocol we used for cDNA-ISS is at: https://www.protocols.io/view/hybiss-hybridization-based-in-situ-sequencing-kqdg34357l25/v1 We have now specified this in the materials and methods section. We generated fresh datasets using sections consecutive to the ones used for RNA-ISS, so as to control for RNA degradation and tissue handling. We updated the text accordingly.

4) To further assess sensitivity between cDNA-ISS and RNA-ISS, an analysis similar to Lee et al. 2022 Scientific Reports, where the distributions of counts are directly compared (Fig 1b-e), would strengthen the manuscript. Furthermore, since many methods in this field assess their sensitivity by comparing to smFISH-based data, this analysis would help place the new method in a common reference frame. Lastly, a comparison to 10X Xenium would also be interesting if possible.
We expanded on the analysis comparison, giving a better overview of the sensitivity increase in a revised Figure 2. We agree with the reviewer that a comparison with smFISH is useful. From previous literature (summarized in https://www.annualreviews.org/docserver/fulltext/genom/24/1/annurev-genom-102722-092013.pdf?expires=1713951927&id=id&accname=guest&checksum=16AC1301F35AC812E47B9530F90535ED), the capture efficiency of cDNA-ISS is estimated to be in the order of 1-5%. Assuming the results presented in the paper can be extrapolated fairly uniformly to the entire transcriptome, this would place the efficiency of open-source dRNA-ISS at about 2-10%.
According to a recent benchmark, Xenium efficiency is likely similar to that of scRNAseq (30%), and the estimated capture efficiency of smFISH is above 90%. We introduced a brief discussion of these estimates in the text.
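The extrapolation behind these estimates is a simple scaling, sketched below under the stated assumption that the roughly 2-fold average sensitivity gain applies uniformly across the transcriptome. The function name is illustrative; the input percentages are the estimates quoted in this response, not new measurements.

```python
def scale_efficiency_range(low, high, fold_increase):
    """Scale a capture-efficiency range by a uniform fold increase."""
    return low * fold_increase, high * fold_increase


CDNA_ISS_RANGE = (0.01, 0.05)  # 1-5%, literature estimate quoted above
FOLD_INCREASE = 2.0            # approximate average RNA-ISS vs cDNA-ISS gain
XENIUM_ESTIMATE = 0.30         # quoted as similar to scRNAseq
SMFISH_LOWER_BOUND = 0.90      # quoted as "above 90%"

rna_low, rna_high = scale_efficiency_range(*CDNA_ISS_RANGE, FOLD_INCREASE)
print(f"Estimated RNA-ISS capture efficiency: {rna_low:.0%} to {rna_high:.0%}")
print(f"Upper estimate relative to Xenium: {rna_high / XENIUM_ESTIMATE:.2f}")
print(f"Upper estimate relative to smFISH lower bound: {rna_high / SMFISH_LOWER_BOUND:.2f}")
```

Because the per-gene gain is reported as variable, the resulting range should be read as an order-of-magnitude estimate rather than a per-gene prediction.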
The introduction, Figure 1D and online code also cover cell segmentation and clustering of ISS data. However, no demonstrations of these steps are given in the manuscript. An addition of this, for instance on the chicken data, would demonstrate the full pipeline. Because these algorithms wrap existing tools that are independently published (PCIseq, Cellpose and Scanpy), a thorough evaluation is not needed in my opinion, but an example would complete the claims made in the introduction.
We now provide a small test dataset that can be used to try out the pipeline and its different modules.This includes all the steps (preprocessing, decoding, segmentation and single-cell clustering) as well as advanced features (deconvolution / denoising).We hope this satisfies the reviewer's request, increases the reproducibility of our work and softens the learning curve for new users.
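Combining ISS with protein detection is a very promising capability which cannot easily be obtained with other spatially resolved transcriptomics techniques. However, in previous work the Nilsson group already demonstrated that this was possible (Mbp protein in Lee et al. 2022 Scientific Reports). Here this is again demonstrated with the GFAP protein. A stronger case would be made if multiple proteins were detected, because it is likely that not all epitopes are equally well preserved during the ISS protocol. However, this is a major effort and therefore toning down the claims would also suffice.
We agree with the reviewer that our claim was not strongly supported by the presented data.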
In the revised version we now include data from a larger number of protein stainings, collected by performing a CODEX multiplexed protein staining experiment after RNA-ISS.The full dataset is linked in the supplementary material.
The CODEX images can be explored either by using the scroller in the bottom left corner, or by clicking on the 'layer' tab and selecting individual images for different antibody stainings. https://lee2024supp.serve.scilifelab.se/mouse_ISS_CODEX.tmap
The authors suggest that next to EdU birth dating, other fluorescent reporters can be used in combination with ISS. However, if genetically encoded fluorescent proteins are used, will their fluorescence not interfere with the ISS signal? If they first need to be degraded, would this affect the ability to use immunohistochemistry after barcode imaging?
We thank the reviewer for raising these very good points.
The answer to these questions depends on the nature of the specific reporter used. The imaging of the fluorescent reporters can potentially be performed at different stages in the experiment. In some cases (e.g. GFP) the three strategies outlined below are valid.
A first option is to image the reporter just after fixation, before the library prep step. A second option is to image the reporter after RCA, but before the first cycle of fluorescent labeling.
In both cases the GFP fluorescence can be significantly reduced by denaturation with a 100% methanol treatment of 10 minutes at room temperature. This methanol-dependent denaturation of fluorescent proteins is a well-documented phenomenon, and it is known to affect GFP derivatives as well as many fluorescent proteins in the red range (Tomato, mOrange, etc.). The level of residual background fluorescence might or might not affect the downstream steps, depending on the initial expression levels. The third option is to methanol-denature the fluorescent proteins, proceed to RNA-ISS and subsequently label the denatured fluorescent protein using immunohistochemistry. This antibody-based strategy is common practice in the Drosophila community when working with methanol-preserved early embryos expressing GFP, which typically lose up to 90% of the GFP fluorescence in the fixation step.
Clearly these strategies can be applied only to methanol-sensitive FPs whose fluorescence can be quenched without actually degrading the protein. If protein degradation is required to quench the reporter's fluorescence, we would suggest starting with library prep, pausing the ISS protocol before the first detection cycle, imaging the desired proteins using immunohistochemistry, degrading the proteins using a proteinase K treatment, and moving on with the detection cycles.
Finally, a fourth option is to use a compatible set of ISS probes and genetically encoded fluorophore, so that they are spectrally separated. Depending on the microscope's settings, this might imply having to sacrifice one of the channels used for decoding, perhaps reducing the multiplexing capability to some extent. We added these suggestions and thoughts to the supplementary manual, in the paragraph: 'Technical notes and considerations for successful ISS experiments'.
The manuscript should better clarify to the reader that the basis for this protocol is HybISS (Gyllborg et al., NAR) rather than the original ISS protocol by Ke et al. 2013 Nature Methods.
Thanks for pointing this out. This is now corrected in the text.
The cross-reactive design of probes between species is an interesting idea and can save resources in specific experimental situations where closely related species are studied. However, it also suggests that RNA-ISS could suffer from more false positives. Could the authors elaborate on this and include a measurement of false positives?
We thank the reviewer for bringing this subject to our attention. Indeed, this is a matter we often discussed, but we don't have a clear view of what a good measurement of false positives would be. One option we often discussed, which seems to be the approach taken by some vendors in their reagent kits, is the inclusion of a negative control set of probes: probes that are not supposed to bind any target and should give zero signals. We show here a couple of examples of such "negative probes" against EGFP or Gal4, tested on ovaries of non-transgenic flies (hence lacking these reporter genes), showing a very low read count even before quality-filtering.
However, we struggle to see how the use of "negative probes" would intrinsically be more informative than validating the specificity of the assay against existing knowledge, hence our choice of not showing these data in the initial submission. The choice of performing our characterization on the fly ovary, where germ-line and somatic-line marker genes are well known and mutually exclusive, was driven by the need for a strong test case made of mutual internal controls, both positive and negative. We don't think that, in this context, adding EGFP or GAL4 probes as negative controls would add any further information. However, we acknowledge this could help in case the use of internal controls is not an option.
Another issue we often debated is what makes a good universal negative control. Or in other words: what's the source of potential experimental noise, and how do we tackle it rigorously in all of our experiments? We believe the sources of noise to be organism- and tissue-specific, and it's not always obvious to us whether a valid universal strategy to prevent noise exists.
Possible causes of noise might be, among others:
1. Expression of closely related paralogues that spuriously bind the probes designed for another gene. We try to address this to the best of our ability in the specificity-check step of probe design, excluding potentially cross-reactive targets within the transcriptome.
2. Unexpected expression of pseudogenes or non-annotated genes with high sequence similarity to the chosen target genes. This can be particularly problematic for poorly annotated transcriptomes, but we don't think a good universal solution exists, besides working with transcriptomes that are as complete as possible. Points 1 and 2 are also discussed in response to Reviewer 1's comment about the root causes of the quantitative differences observed between mouse and rat.
3. Template-independent ligation of padlock probes. All ligases have some small degree of template-independent activity, and this is a potential source of noise. We try to prevent this by temporally separating hybridisation and ligation, and by introducing a stringent washing step (neither of which was necessary for DNA-ISS), to remove most of the unhybridized or partially hybridized probes. The fact that cross-reactive design of probes is sometimes possible reinforces the notion that the specificity of our RNA-ISS assay derives largely from hybridization specificity and that the ligation specificity of the padlock probes plays only a secondary role, hence our choice to intervene aggressively in the washing step after hybridization. Perhaps negative probes might help address the efficacy of this third point, but we feel this would depend on the specific experiment, model and expected off-targets for the negative probes, which may vary depending on the model system used.
We believe it's generally safer to have a couple of validated internal controls. When this is not possible, we agree that negative probes might be a valid approach, but we insist on the need to carefully design controls for each species, keeping in mind all the potential caveats.
We added these considerations in the Supplementary Manual, in the section 'Technical notes and considerations for successful ISS experiments'.
For the simulation of the Macaque probes the authors note that a human probe panel has 74% matches, 12% has no-match and 14% has off-target matches to the Macaque transcriptome. However, a discussion and conclusion are lacking on whether that would be enough to yield high quality data.
We thank the reviewer for bringing this up, as it relates to the puzzling results on the rat vs mouse comparison that Reviewer 1 has pointed out.
The answer to this question largely depends on the scientist's need. Our suggestion would be to remove all the no-match and off-target probes from the experiment, and focus on the list of genes resulting from the single-matching probes. Is the resulting gene list satisfactory for the needs of the experimenter? On the one hand, some genes will be excluded from the initial list, and it would be impossible for the scientist to probe them on Macaque, either because of no reactivity or cross-reactivity. Whether or not these missing genes are of crucial importance largely depends on the specific question the scientist has and how important the analysis of those specific genes is.
On the other hand, a considerable number of genes in the panel will be recognized by a reduced number of probes. We acknowledge that this factor might potentially reduce the number of successful mRNA detections per cell, and produce weaker quantitative results. We estimate that, on average, the global detection efficiency in macaque would be 74% of the detection efficiency in human (assuming comparable gene expression levels and overall RNA integrity), so a notch closer to cDNA-ISS. However, as we've seen in the mouse vs rat case, many variables might determine how one set of probes translates to a different animal. Depending on the specific question, this level of detection efficiency might or might not be sufficient, and our suggestion would be to do a test run. We'd like to remark, however, that these human probes were not explicitly designed to be cross-reactive on Macaques, and this expected efficiency drop could have been prevented by including cross-reactivity as a prior criterion in the probe design phase.
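The filtering suggested above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the gene and probe names and the match classes ("single", "none", "off_target") are hypothetical and are not taken from the probe design pipeline. The fraction of retained probes is used here as a crude proxy for relative detection efficiency, which holds only if all probes contribute equally.

```python
from collections import defaultdict

# Hypothetical probe records: (gene, probe_id, match_class). The match class is
# assumed to come from an in silico check against the target transcriptome.
probes = [
    ("GENE_A", "A_1", "single"),
    ("GENE_A", "A_2", "single"),
    ("GENE_A", "A_3", "none"),
    ("GENE_B", "B_1", "off_target"),
    ("GENE_B", "B_2", "none"),
    ("GENE_C", "C_1", "single"),
    ("GENE_C", "C_2", "off_target"),
]


def summarize_cross_species_panel(probes):
    """Keep single-matching probes and report what is lost per gene."""
    total = defaultdict(int)
    kept = {}
    for gene, probe_id, match_class in probes:
        total[gene] += 1
        if match_class == "single":
            kept.setdefault(gene, []).append(probe_id)
    # Fraction of each gene's probes that survive the filter.
    retained_fraction = {g: len(kept.get(g, [])) / n for g, n in total.items()}
    # Genes with no usable probe cannot be assayed in the second species.
    dropped_genes = sorted(g for g in total if g not in kept)
    return kept, retained_fraction, dropped_genes


kept, retained_fraction, dropped_genes = summarize_cross_species_panel(probes)
# Panel-wide retained fraction (the analogue of the 74% figure quoted above).
panel_fraction = sum(len(v) for v in kept.values()) / len(probes)

print("usable genes:", sorted(kept))
print("dropped genes:", dropped_genes)
print("panel-wide retained fraction:", round(panel_fraction, 2))
```

The per-gene retained fraction highlights genes that remain in the panel but are covered by fewer probes, which are the ones most likely to need a test run.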
Figure 2 A-E, why is the background so different between the images?
Thanks for pointing this out. The background is different because in this specific experiment each gene was labeled with a unique fluorophore, and each channel has a specific signal/noise ratio in a given tissue. This is particularly visible in the 'tj' panel, because these fly ovaries had a very bright autofluorescence in the AF488 channel (the fluorophore we used to detect that gene).
For none of the presented experiments the full dataset with all detected genes is shown simultaneously in a plot. Adding this would be valuable to show the alignment performance of the algorithm and give a better impression of the capabilities of the presented protocol and analysis code.
We think this is a very good suggestion. All the datasets can now be visualized at this permanent link: https://lee2024supp.serve.scilifelab.se/ When loading each of the datasets, by default all mRNAs are simultaneously plotted to showcase the capability of the method.

Minor comments:
The keyword "Multi-omics" is a stretch because only one protein is detected using immunohistochemistry alongside the spatial transcriptome.
This was also pointed out by reviewer 1. We now added multiple protein stainings based on an RNA-ISS+CODEX dataset generated to support our "multi-omics" claim. The full ISS+CODEX dataset is visible at https://lee2024supp.serve.scilifelab.se/mouse_ISS_CODEX.tmap We hope both reviewers agree that the new datasets support the claims.

Introduction:
Clarify that "sequencing-based" methods use Next Generation Sequencing, to prevent confusion with ISS.
This is now corrected in the text.
The authors describe the rolling circle amplification products to be "large", "very bright" and "massively amplified". It would be more informative to specify the (approximate) size, and the amount of amplification.
We agree with the reviewer, and we edited the text accordingly. We provide a size estimate for the amplicon (1 micron), and an amplification estimate (100-1000 copies).
On a similar note, please specify what size range a "large tissue section" is.
We agree with the reviewer, and we included more specific information in the text. We provide a realistic time budget based on our experience: 35 minutes for a 5-color scan of a tissue square of about 0.7 cm per side. We also tabulated imaging times of RNA-ISS against a published osmFISH dataset (https://www.nature.com/articles/s41592-018-0175-z).
The authors list the detection of heavily fixed mRNAs as one of the limitations of ISS, and in the next paragraph summarize that the manuscript overcomes these limitations. However, no proof (or reference to previous work) is given that indicates that over-fixing is an issue. Nor is it demonstrated that the new protocol overcomes this specific issue.
This is a very good point. The decrease in ISS sensitivity caused by overly fixed samples is something that seems to be assumed knowledge, but we could not find references justifying the claim, except a few papers claiming RNA degradation as a consequence of excessive fixation (e.g. https://www.nature.com/articles/srep21418, https://academic.oup.com/nargab/article/6/1/lqae008/7595396?login=true). Intriguingly, the Xenium test datasets released by 10x Genomics also show a reduced efficiency in FFPE samples vs fresh frozen, but we would argue (in agreement with the reviewer) that this does not directly point to fixation as the problematic step. To our knowledge, the samples were not matched or controlled for other factors (e.g. storage time).
We removed the claim from the text, and we believe this should be something to be explored more systematically in later work.

Results and discussion:
For the proof of principle experiments in Drosophila, mouse and chicken it would be helpful to see the images of the in situ reference databases side-by-side with the ISS data. A reference to the databases should also be included in the main text (not only in the figure legend).
We agree with the reviewer. However, especially for Drosophila, the description of the expression patterns in the ovary predates the creation of modern databases. We could find pictures for grk and slbo in the Dresden Ovary Table (http://tomancak-srv1.mpicbg.de/DOT/main) but not for the other genes. We'd like to remark, however, that these genes were chosen precisely because they have served as established markers for ovarian cell subpopulations for decades, and are well known to the fly community. If the reviewer agrees, at least for Drosophila, we'd prefer to sum up the information in a reference figure, and refer to the original publications rather than producing side-by-side images, because we might need permission from the publishers to include the original pictures in our paper. Not all the genes we probed are represented in the Allen Brain Atlas, so we opted to plot them side-by-side with the cDNA-ISS expression, and use that as a ground truth. Finally, the data representation in the GEISHA database doesn't always cover the chosen genes for chicken, particularly for the optic tectum. We rephrased the statement in the relevant paragraph in the text to: "For both species, the decoded expression patterns are localized, specific, and consistent with previous available knowledge whenever existing".

Include additional information on the homology between the engrailed gene in the different species. It would also be helpful to add percentages of nucleotides that differ inside the probed region between the species.
We added the following information to the supplementary data, explaining the difference in the target binding sites across species

Figure 3D-G, please add a label that this is rat brain. Possibly best to do this for all species in all figures.
This is now fixed in the figure.

Material and methods:
The sequences of all probes should be available in a supplemental file.
This was pointed out by Reviewer 1 as well. We now added this information.

Manufacturers of all chemicals and reagents should be included.
This information is now presented in the text and on protocols.io.
To which samples was TrueBlack applied?
TrueBlack was applied to chicken; the text is now updated.
Which objective was used to image the RCPs?
20x for mouse and rat, 40x for chicken and Drosophila. We clarified this in the text.
How many Z-stacks were imaged?
Imaging was performed using z-steps of 1 µm at 20x and 0.5 µm at 40x. We clarified this in the text.
In the filtering of spots a quality score of 0.5 is used. How is this value determined and is there any ground truth data that is used to establish this value?
The formula for the calculation of the quality score is mentioned in the supplementary manual, but we agree with the reviewer that the rationale for its use should be made more explicit in the main text.
For each individual spot we extract the normalized intensity across all channels for each cycle. For every cycle, we divide the intensity value of the highest intensity channel ('true signal') by the sum of the intensities in all channels. Intuitively, values close to 1 represent a 'pure' signal, while lower values represent progressively uncertain identities. The minimum quality value for a 4-color decoding (as in all the experiments presented in the paper) is 0.25. At the end of the decoding, each spot has n quality scores, where n is the number of detection cycles. In our filtering step (qmin > 0.5) we keep a spot only if its quality was above 0.5 in all the detection cycles.
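The calculation described above can be illustrated with a minimal Python sketch. It is not the code of the ISS_decoding module: the function names, the nested-list data layout and the example intensities are hypothetical, and the handling of an all-zero cycle is an assumption.

```python
def cycle_quality(intensities):
    """Quality of one cycle: brightest channel divided by the sum of all channels."""
    total = sum(intensities)
    if total <= 0:
        # Assumption: a cycle with no signal in any channel gets the lowest quality.
        return 0.0
    return max(intensities) / total


def spot_qualities(spot):
    """One quality score per detection cycle (spot = list of per-cycle channel intensities)."""
    return [cycle_quality(cycle) for cycle in spot]


def passes_filter(spot, q_min=0.5):
    """Keep the spot only if every cycle's quality is above q_min."""
    return min(spot_qualities(spot)) > q_min


# Hypothetical normalized intensities: one row per cycle, four channels per row.
clean_spot = [[0.9, 0.05, 0.03, 0.02], [0.1, 0.8, 0.05, 0.05], [0.0, 0.1, 0.1, 0.8]]
mixed_spot = [[0.9, 0.05, 0.03, 0.02], [0.4, 0.4, 0.1, 0.1], [0.0, 0.1, 0.1, 0.8]]

print(passes_filter(clean_spot))  # every cycle has one dominant channel
print(passes_filter(mixed_spot))  # the second cycle is ambiguous between two channels
print(cycle_quality([1.0, 1.0, 1.0, 1.0]))  # four equal channels give the 4-color minimum
```

Filtering on the minimum across cycles rather than the mean means that a single ambiguous cycle is enough to discard a spot, which matches the "above 0.5 in all the detection cycles" criterion.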
From our experience, across a variety of datasets, experiments and imaging conditions, a quality minimum of 0.5 is a threshold that dramatically shifts the ratio between the assigned reads (spots whose identity is strictly and successfully matched against a decoding table) and the non-assigned reads (spots that fail to be assigned one of the allowed identities). We typically plot the assigned vs non-assigned read counts against some quality indicators (minimum and mean quality across cycles), to make an informed decision about our filtering criteria, and most often we find that a good balance is to filter on Qmin = 0.5. However, of course, this threshold can and should be tuned according to the specific experiment. A guide on how to explore the quality measures in relation to spot assignment is provided in the Jupyter notebook in the ISS_decoding module, and the use of the relevant functions is described in the Supplementary guide.
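As an illustration of this kind of exploration, the sketch below tabulates assigned and non-assigned reads retained at several candidate thresholds. It is a simplified stand-in for the notebook mentioned above, not a reproduction of it: the spot list, the function name and the threshold grid are hypothetical.

```python
def sweep_thresholds(spots, thresholds):
    """For each threshold, count assigned and non-assigned spots that survive filtering.

    spots: list of (minimum quality across cycles, assigned?) pairs.
    Returns a list of (threshold, n_assigned, n_unassigned) rows.
    """
    rows = []
    for t in thresholds:
        surviving = [assigned for q_min, assigned in spots if q_min > t]
        n_assigned = sum(surviving)
        rows.append((t, n_assigned, len(surviving) - n_assigned))
    return rows


# Hypothetical decoded spots: (minimum quality, matched the decoding table?)
spots = [
    (0.95, True), (0.80, True), (0.62, True), (0.55, False),
    (0.48, False), (0.45, True), (0.35, False), (0.30, False),
]

table = sweep_thresholds(spots, thresholds=[0.25, 0.4, 0.5, 0.6])
for threshold, n_assigned, n_unassigned in table:
    print(f"q_min > {threshold}: assigned={n_assigned}, non-assigned={n_unassigned}")
```

In a real dataset the same table would be plotted, and the threshold chosen where non-assigned reads drop sharply while most assigned reads are still retained.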
We added a brief explanation in the main text.

ISS manual:
In the filename "TileScan 0--Stage01--C03.tif" the index after "Stage" suggests that a maximum of 100 FOVs can be included (Stage00 to Stage99). Is the algorithm limited to 100 FOVs?
We thank the reviewer for pointing this out. The algorithm is not limited to 100 FOVs. We regularly process samples with >200 FOVs. We clarified this in the manual.
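To illustrate why a two-digit index in an example filename need not cap the number of FOVs, here is a minimal Python sketch that parses the stage index as an integer of any width. The regular expression, the function name and the three-digit filename are assumptions made for illustration; they are not taken from the analysis code, and the exact naming the microscope uses above 99 FOVs may differ.

```python
import re

# Assumed layout: "<prefix>--Stage<FOV index>--C<channel index>.tif",
# where both indices may have any number of digits.
TILE_PATTERN = re.compile(r"^(?P<prefix>.+)--Stage(?P<fov>\d+)--C(?P<channel>\d+)\.tif$")


def parse_tile_name(name):
    """Return (prefix, FOV index, channel index) parsed from a tile filename."""
    match = TILE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized tile filename: {name!r}")
    return match["prefix"], int(match["fov"]), int(match["channel"])


names = [
    "TileScan 0--Stage215--C00.tif",  # hypothetical name for an FOV index above 99
    "TileScan 0--Stage01--C03.tif",
    "TileScan 0--Stage99--C01.tif",
]

# Sort numerically by FOV and channel, not alphabetically by filename.
parsed = sorted((parse_tile_name(n) for n in names), key=lambda t: (t[1], t[2]))
for prefix, fov, channel in parsed:
    print(prefix, fov, channel)
```

Converting the indices to integers before sorting avoids the alphabetical ordering problem in which "Stage100" would sort before "Stage99".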
Please give an indication of how large a large dataset is. This will help users prepare the hardware.
An experiment imaging a square of about 0.7 millimeters of side, imaged at 20x magnification using 5+1 colours (5 channels + DAPI) is about 600 GB. The bottleneck in our setup is data transfer, so we suggest a cloud/NAS infrastructure, to transfer the imaging data straight from the microscope onto the analysis workstation. However, in the past we often worked with external hard drives with some success. In this case we would advise using fast SSDs for all the processes from imaging to decoding.
We added these considerations in the Supplementary Manual, in the section 'Computer requirements'.

Code:
The authors could consider putting a minimal raw dataset online for first time users to try.
We agree with the reviewer. We have now linked a testable dataset. Because of its relatively large size (50 GB) the dataset is hosted at this link: https://figshare.com/s/8e0c2bd43a3975fcff4a The link is currently private, but will be made public and searchable during the publication process in agreement with the publisher, pending acceptance of the manuscript.
We hope these changes satisfy all the reviewer's comments and requests.
On behalf of all the authors, we sincerely appreciated the positive tone of the reviews and the very constructive comments.
The overall evaluation is positive and we would like to publish a revised manuscript in Development, provided that the referees' comments can be satisfactorily addressed. Please attend to all of the reviewers' comments in your revised manuscript and detail them in your point-by-point response. If you do not agree with any of their criticisms or suggestions explain clearly why this is so. If it would be helpful, you are welcome to contact us to discuss your revision in greater detail. Please send us a point-by-point response indicating your plans for addressing the referees' comments, and we will look over this and provide further guidance.

Sincerely, Marco Grillo
Please address remaining reviewer concerns. I do not think additional demonstration of the pipeline is needed as suggested by reviewer 2, but you can include it if possible. Please address the other concerns of reviewers 1 and 2: while most can be addressed with additions to the text, there were questions raised regarding new data included in the revision. Please carefully address these concerns with additions to the text and some additional quantification (to build confidence in the new data).

Reviewer 1

Advance summary and potential significance to field
I appreciate the authors' effort in the revision of the article. Overall, the authors have responded and corrected every point that was requested. The new experiments showcased the power of RNA-ISS against the cDNA-based approach and the flexibility of the method to combine it with immunofluorescence or EdU labelling. Below the authors will find a few minor modifications.

Comments for the author
Minor modifications
- Please, add the references for the following claims:
"These amplicons are large (approx. 1 micron) and chemically stable, contain concatemers of hundreds of copies of the original probe, …"
"To place these values in a common reference frame, smFISH is estimated to have a capture efficiency of over 90%, while droplet based scRNAseq efficiency rounds 30%, according to the 10x Chromium manual, …"
"… the last common ancestor of D. melanogaster and D. virilis is estimated to be 40 MYA, a time interval corresponding roughly to the split between old and new world monkeys."
"… using a set of padlock probes explicitly designed to work in both species (<20 MY apart), …"
"Among these, only the 30mers with a C or G in position 16 are kept: this is because T4RNAl2 has a slight positive efficiency bias towards terminal 3' G or C."
In "RNA-ISS recapitulates known mRNA expression patterns with high sensitivity"
- The text needs to be rearranged, so the order of the Figures matches the order in which they appear on the text (e.g. Fig 2D comes after Fig 2A).
- Check writing in "… illustrating that RNA-ISS is able to recapitulate the spatial expression of all genes with improved detection efficiency while maintaining specificity with RNA-ISS (Fig. 2C and 2D)."
In "RNA-based ISS is compatible with standard labeling techniques"
- Please, name the labeled proteins so readers can identify them on TissUUmaps: "… we ran RNA-ISS on a mouse coronal section using our new RNA-ISS chemistry on a previously described probe panel (Gyllborg et al., 2020; Ke et al., 2013; Lee et al., 2022) and posteriorly labeled a mouse coronal section with 11 barcoded antibodies, …"

Reviewer 2

Advance summary and potential significance to field
See first review.

Comments for the author
The authors have added most of the requested additional information and comparisons to previous/other methods to the manuscript. This places the advances in the proper context and aids the reader to evaluate the work. The option to now interactively explore the data online is a nice addition to the manuscript.
Nevertheless, I still think the manuscript could benefit from a small demonstration of the full analysis pipeline, including cell segmentation and clustering of cells, to give readers a handle on the data quality and to demonstrate the claims made in the introduction. However, this is not an absolute necessity because it can be found in the GitHub tutorials to some extent, but it would improve the manuscript.
The addition of the new data and analysis raised two concerns that I would like the authors to address.
Point 1, RNA / cDNA comparison
The 9 gene comparison between cDNA-ISS and RNA-ISS and the updated figure 2 is a very helpful addition to the manuscript and convincingly shows that RNA-ISS has higher counts on average. However, I think more proof is needed to show that these are not technical artifacts.
First, please add the standard deviation in the reported mean fold change between RNA-ISS and cDNA-ISS, especially because the variation in fold change between genes is relatively high. What could the source of the variation be? Second, in Figure 2C and Supplementary Figure 1E there are clearly more detected molecules outside the tissue area in the RNA-ISS condition compared to the cDNA-ISS condition. If these are false positives, this could bias, or in the worst-case dominate, the comparison substantially. Furthermore, also inside the tissue there seem to be more counts in areas where there are no, or few, counts in the cDNA-ISS method. For instance Rorb goes up in ventricle and hippocampus in the RNA-ISS, while this and other datasets do not suggest Rorb should be expressed there. Could the authors comment on the source of these dots? Is this ambient RNA that got stuck on the surface of the glass, or has it another technical source?
To prove the effect of the spurious signal on the fold change, could the authors please count the number of molecules outside the tissue for each of the 9 genes in RNA-ISS and cDNA-ISS, and correlate these values with the per-gene fold change to see if this explains the per-gene variability.
Next, could the authors calculate the count density outside the tissue and subtract this value from the count density inside the tissue, and then repeat the fold-change analysis? This would account for the case in which there is a homogeneous distribution of false positives over the whole imaged area.
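The background subtraction requested here can be sketched as follows. This is a minimal illustration, assuming that false positives are spread homogeneously over the imaged area; the function names, the dictionary keys, and all counts and areas are hypothetical and are not measurements from the manuscript.

```python
def corrected_density(counts_in, area_in, counts_out, area_out):
    """In-tissue count density minus the out-of-tissue (background) density."""
    inside = counts_in / area_in
    outside = counts_out / area_out
    # A negative value would mean background exceeds signal; clip to zero.
    return max(inside - outside, 0.0)


def background_corrected_fold_change(rna, cdna):
    """Fold change RNA-ISS / cDNA-ISS after subtracting each method's background."""
    rna_density = corrected_density(**rna)
    cdna_density = corrected_density(**cdna)
    if cdna_density == 0:
        return None  # fold change undefined when the reference density is zero
    return rna_density / cdna_density


# Hypothetical values for one gene (counts in molecules, areas in mm^2).
rna_iss = {"counts_in": 12000, "area_in": 40.0, "counts_out": 500, "area_out": 10.0}
cdna_iss = {"counts_in": 4400, "area_in": 40.0, "counts_out": 100, "area_out": 10.0}

raw_fold_change = (rna_iss["counts_in"] / rna_iss["area_in"]) / (
    cdna_iss["counts_in"] / cdna_iss["area_in"]
)
corrected_fold_change = background_corrected_fold_change(rna_iss, cdna_iss)

print("raw fold change:", round(raw_fold_change, 2))
print("background-corrected fold change:", corrected_fold_change)
```

Repeating this per gene and comparing raw and corrected values would show how much of the apparent gain could be attributed to a uniform background.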
Furthermore, to exclude any other biases could the authors clarify if the exact same probes are used between the two conditions? The updated methods mention that the same number of probes is used, but not whether these are the exact same sequences.
Lastly, are all the settings and parameters used for the image analysis of the two conditions exactly the same?
Point 2, Protein co-detection
The combination of CODEX and ISS would be extremely powerful to study tissues and I commend the authors for trying to combine these methods. However, while looking at the CODEX images of the 10 proteins (the manuscript mentions 11) on https://lee2024supp.serve.scilifelab.se/mouse_ISS_CODEX.tmap, it seems like the antibody staining had some aspecific signal in the white matter of the brain. All antibodies, except GFAP and ACTB, label the white matter in the same way. Although I am not familiar with all proteins and their localization, I would expect MOG to be located in the white matter but not others, like the microglia marker MRC1 and gray matter astrocyte marker S100B. Could the authors discuss the quality of the CODEX experiment and address this concern? Furthermore, the GFAP staining presented in Figure 3M looks very similar to the other, potentially aspecific, protein stainings, while the GFAP staining in the CODEX experiment looks more like true GFAP labeling. See for instance: https://doi.org/10.1007/s12031-020-01771-w and https://doi.org/10.1186/2040-2392-2-7 In these examples, note that there is no GFAP labeling in the white matter of the striatum, while this is visible in Figure 3M.
Furthermore, it would be nice to see the results of CODEX and ISS in a (supplementary) figure, and not just on the online data viewer.

Second revision
Author response to reviewers' comments
On behalf of all coauthors, I wish to thank both reviewers for the insightful comments and suggestions. We attach here a point-by-point response to their comments. We also edited the text for clarity and brevity, as requested by the editorial office.
Reviewer 1:
We thank Reviewer 1 for the positive comments and for pointing out the minor revisions addressed below:
Please, add the references for the following claims
"These amplicons are large (approx. 1 micron) and chemically stable, contain concatemers of hundreds of copies of the original probe, …"
We added the following reference: Lizardi et al, 1998 (https://www.nature.com/articles/ng0798_225)
"To place these values in a common reference frame, smFISH is estimated to have a capture efficiency of over 90%, while droplet based scRNAseq efficiency rounds 30%, according to the 10x Chromium manual, …"
We edited the text, pointing to a single recent reference where the efficiencies of the different technologies are systematically reviewed and compared (Salas, 2023) (https://www.biorxiv.org/content/10.1101/2023.02.13.528102v1)
"… the last common ancestor of D. melanogaster and D. virilis is estimated to be 40 MYA, a time interval corresponding roughly to the split between old and new world monkeys."
We added the relevant reference to the text: https://academic.oup.com/mbe/article/13/1/132/1055488
"… using a set of padlock probes explicitly designed to work in both species (<20 MY apart), …"
We apologize for the oversight. The divergence time estimate for mouse and rat based on recent molecular data is actually larger (about 30 MY apart, https://academic.oup.com/mbe/article/18/5/777/1018665). We had initially followed an older estimate based on paleontological evidence (https://www.sciencedirect.com/science/article/pii/0047248480900627?via%3Dihub). We corrected the sentence in the text and added the pertinent reference.
"Among these, only the 30mers with a C or G in position 16 are kept: this is because T4RNAl2 has a slight positive efficiency bias towards terminal 3' G or C."
There's no reference in the literature pointing to this bias; the design criterion comes from our initial tests, not shown in the manuscript. However, this bias turned out to be very subtle, so (although this criterion was followed for the probes designed in the present work) we no longer implement it in the lab's current workflow. This allows us more flexibility in probe placement, giving transcripts a more even coverage across their length. We removed the claim about the efficiency bias from the text.
In "RNA-ISS recapitulates known mRNA expression patterns with high sensitivity" -The text needs to be rearranged, so the order of the Figures matches the order in which they appear on the text (e.g. Fig 2D comes after Fig 2A).
Thanks for pointing this out. The text is now rearranged to match the figure order.
-Check writing in "… illustrating that RNA-ISS is able to recapitulate the spatial expression of all genes with improved detection efficiency while maintaining specificity with RNA-ISS (Fig. 2C and 2D)."
Thanks for pointing this out. We edited the text for clarity.
In "RNA-based ISS is compatible with standard labeling techniques" -Please, name the labeled proteins so readers can identify them on TissUUmaps "… we ran RNA-ISS on a mouse coronal section using our new RNA-ISS chemistry on a previously described probe panel (Gyllborg et al., 2020; Ke et al., 2013; Lee et al., 2022) and posteriorly labeled a mouse coronal section with 11 barcoded antibodies, …"
We added a supplementary figure (also requested by Reviewer 2) with the CODEX antibody staining, and added the protein list to the text. We also apologize for a small oversight: for some reason the Synaptophysin staining was not included in the online viewer, but the online dataset is now updated with the missing staining.

Reviewer 2
We thank the reviewer for the evaluation of the new data and for pointing out their valid concerns. We acknowledge that a more detailed analysis might be beneficial to distinguish the real increase in efficiency of RNA-ISS from technical noise.
Here follows a point by point response to the comments:
The 9 gene comparison between cDNA-ISS and RNA-ISS and the updated figure 2 is a very helpful addition to the manuscript and convincingly shows that RNA-ISS has higher counts on average. However, I think more proof is needed to show that these are not technical artifacts. First, please add the standard deviation in the reported mean fold change between RNA-ISS and cDNA-ISS. Especially because the variation in fold change between genes is relatively high. What could the source of the variation be?
We thank the reviewer for pointing this out. We added the standard deviation in the reported mean fold change (average fold change = 2.38; stdev = 1.18).
As we mentioned in the previous round of response to reviewers, we believe more systematic work on the probe design criteria might clarify the reasons behind the source of variation, and hopefully stabilize this variation in the upper range. Of all the factors that might play a role in determining a probe's efficiency, some are known to us and were included in our design pipeline (e.g. optimal GC content, absence of homopolymeric stretches, etc.). Other factors are more complex and somewhat less obvious (secondary structure of target mRNAs in the cellular environment, partial pairing between different mRNA species, presence of RNA binding proteins, etc.). We speculate that cDNA-based methods might allow bypassing some of these constraints (e.g. reverse transcriptases have strong strand displacement activity and might productively resolve secondary structures or RNA pairing, allowing accessibility to RNA regions that would otherwise be tightly locked to probes in RNA-based approaches). We think more work is needed to characterize, understand and stabilize this variation towards the upper range.
Second, in Figure 2C and Supplementary Figure 1E there are clearly more detected molecules outside the tissue area in the RNA-ISS condition compared to the cDNA-ISS condition. If these are false positives, this could bias, or in the worst-case dominate, the comparison substantially.
We agree, this is visually striking and the reviewer is right in pointing this out.We don't believe these extra-tissue dots are false positives.Our argument for this is based on 2 elements: first, the level of detection of negative control padlock probes (as shown in the supplementary manual for the GAL4 negative probes) is very low.However, we acknowledge that it might be difficult to generalize results from negative probes, as we discussed in the previous round of revisions.
Our second element against these signals arising from unspecific ligation of padlock probes is that the abundance of mRNA dots outside the tissue is strongly correlated with the detected gene expression levels inside it (i.e. high expressors show a high number of extra-tissue dots). One might of course argue, as the reviewer hints, that the high detection efficiency inside the tissue could be an artifact of the false positive ligations happening outside, and that the increased sensitivity is a consequence of poor specificity.
Our best way to resolve this potential circularity is the following: the amount of extra-tissue dots in RNA-ISS is strongly correlated with the expression levels detected inside the tissue when using cDNA-ISS (see below for a more detailed analysis). To us, this indicates a directional causal link between the gene expression levels and the abundance of extra-tissue spots and suggests that the extra-tissue signal is real mRNA, most likely coming from diffusion/smearing of (partially degraded?) RNAs during the cutting and handling of the sections. In our experience, a similar effect is also often visible when using the CARTANA HS-ISS kit, especially on high expressors, and we believe it's a feature of RT-free methods, which are probably less impaired by mRNA fragmentation than cDNA-based approaches.
Furthermore, inside the tissue there also seem to be more counts in areas where there are no, or few, counts in the cDNA-ISS method. For instance, Rorb goes up in ventricle and hippocampus in the RNA-ISS, while this and other datasets do not suggest Rorb should be expressed there.
We understand the reviewer's concern. We warn, however, against using cDNA-ISS as ground truth: because of its lower capture efficiency, lack of detection might sometimes be confused with lack of expression. The same applies to traditional chromogenic in situ hybridisation, where low-level staining is sometimes difficult to interpret in terms of absolute number of molecules vs. background noise.
As the reviewer suggests, these Rorb dots outside of the expected locations might indeed be mRNAs diffusing from nearby locations. However, we'd also like to point out that Rorb mRNA seems to be expressed (although at relatively low levels) in the hippocampus, according to these scRNAseq and MERFISH datasets:
scRNAseq: https://rb.gy/f118vj
MERFISH: https://rb.gy/x7nw3s

Could the authors comment on the source of these dots? Is this ambient RNA that got stuck on the surface of the glass, or has it another technical source?
Ambient RNA that gets stuck on the surface of the glass is certainly problematic, as we have discussed above when addressing the presence of dots outside the tissue. However, with the current data we cannot tell with certainty whether that is the cause of the Rorb detection, at least in the case of the hippocampus.
To prove the effect of the spurious signal on the fold change, could the authors please count the number of molecules outside the tissue for each of the 9 genes in RNA-ISS and cDNA-ISS, and correlate these values with the per-gene fold change to see if this explains the per-gene variability.
To perform the suggested analysis we used TissUUmaps to manually draw an ROI around the tissue, using the DAPI signal as a reference to outline the tissue area. This is what we call "inside the tissue" in the following paragraphs. We then drew a second ROI spanning the actual imaged area, avoiding the black regions added by the stitching process. Creating this second ROI is crucial to make sure the analysis is not biased by an uneven representation of black areas in the images across conditions. The area contained between the external and internal ROI is called "outside the tissue" in the following paragraphs. We then used these ROI masks to assign a categorical variable to the reads, representing whether each read falls "inside" or "outside" the tissue. We finally performed a correlation analysis on a gene-by-gene level.
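The read labelling described above can be sketched as follows. This is only an illustration under stated assumptions, not the code we ran: the ROIs are taken to be simple polygons given as lists of (x, y) vertices, and the names `point_in_polygon` and `classify_read` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if pt lies inside the closed polygon."""
    x, y = pt
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_read(pt: Point, tissue_roi: List[Point], imaged_roi: List[Point]) -> str:
    """Label a read as 'inside' the tissue, 'outside' it (but within the
    imaged area), or 'excluded' (e.g. black stitching padding)."""
    if point_in_polygon(pt, tissue_roi):
        return "inside"
    if point_in_polygon(pt, imaged_roi):
        return "outside"
    return "excluded"


# Hypothetical ROIs: a 10 x 10 imaged area with a 4 x 4 tissue region in it.
imaged_roi = [(0, 0), (10, 0), (10, 10), (0, 10)]
tissue_roi = [(3, 3), (7, 3), (7, 7), (3, 7)]
labels = [classify_read(p, tissue_roi, imaged_roi) for p in [(5, 5), (1, 1), (12, 5)]]
```

Reads labelled "excluded" fall in neither category, which mirrors the role of the second ROI in keeping stitching padding out of the comparison.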
The results are as follows:
1. The number of "outside" reads of RNA-ISS per gene and the corresponding raw fold change (dRNA/cDNA all slide) are not correlated (Pearson Correlation: -0.39, P-value: 0.29).
2. The ratio between RNA/cDNA "outside" counts does not correlate with the fold change (Pearson Correlation: -0.02, P-value: 0.95).
3. The ratio between RNA/cDNA "inside" counts correlates with the fold change (Pearson Correlation: 0.99, P-value: 5 × 10^-15).
We believe these results suggest that most of the total fold change is explained by reads on the tissue, and that the reads outside the tissue contribute only marginally to the fold change increase.
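For readers who wish to repeat this kind of gene-by-gene comparison on their own data, a minimal sketch is given below. It assumes per-gene read counts stored in plain dictionaries; the function names are hypothetical and the code computes only the Pearson coefficient, not the P-value (a library routine such as `scipy.stats.pearsonr` returns both).

```python
import math
from typing import Dict, Sequence


def per_gene_fold_change(rna_counts: Dict[str, float],
                         cdna_counts: Dict[str, float]) -> Dict[str, float]:
    """RNA-ISS / cDNA-ISS count ratio per gene; genes with no cDNA reads are skipped."""
    return {g: rna_counts[g] / cdna_counts[g]
            for g in rna_counts if cdna_counts.get(g, 0) > 0}


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)
```

With a dictionary of "outside" counts and one of fold changes sharing the same gene keys, the two value lists (in the same gene order) can be passed directly to `pearson_r`.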
Finally, we noticed that the "outside counts" for a given gene seemed to be correlated with its expression inside the tissue (high expressors tend to show more extra-tissue dots). To rule out the circular argument described above, we assessed the correlation between "RNA-ISS outside counts" and "cDNA-ISS inside counts". This should allow us to inspect a causal link between the level of expression of a gene and the abundance of its extra-tissue detection. These 2 measures are indeed correlated (Pearson Correlation: 0.88, P-value: 0.001).
To us this suggests that the "outside reads" in RNA-ISS are proportional to the real gene expression levels and likely arising from real mRNA detection events.We speculate, as hinted above, that these signals are detected because RNA-ISS might be more capable than cDNA-ISS of detecting partially degraded mRNAs sticking to the glass.
Next, could the authors calculate the count density outside the tissue, subtract this value from the count density inside the tissue, and then repeat the fold-change analysis, in case there is a homogeneous distribution of false positives over the whole imaged area?
As outlined before, we created the 2 ROIs to cover the tissue and the extra-tissue space. We then used the ROI contours to calculate the total area of each ROI across conditions and, for each gene, computed the read density per unit area, inside and outside the tissue.
As suggested by the reviewer, we then subtracted the "outside density" from the "inside density", both for cDNA and dRNA. We then analyzed the ratio between the dRNA "corrected density" and its cDNA counterpart. This correction reduced the fold change increase of RNA-ISS vs. cDNA-ISS to about 1.96 from an initial value of 2.38. It also reduced the standard deviation to 1.04 from an initial 1.18.
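The density correction can be written out as a short sketch. The assumptions here are ours for illustration: ROIs are simple polygons, the "outside" area is the imaged-ROI area minus the tissue-ROI area, and all function names and the example numbers are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(polygon: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    n = len(polygon)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def corrected_density(n_inside: int, n_outside: int,
                      area_inside: float, area_outside: float) -> float:
    """Read density inside the tissue minus the read density outside it."""
    return n_inside / area_inside - n_outside / area_outside


def corrected_fold_change(rna_density: float, cdna_density: float) -> float:
    """Ratio of background-corrected densities (RNA-ISS over cDNA-ISS)."""
    return rna_density / cdna_density


# Hypothetical example: 10 x 10 imaged area, 4 x 4 tissue region.
area_in = polygon_area([(3, 3), (7, 3), (7, 7), (3, 7)])
area_out = polygon_area([(0, 0), (10, 0), (10, 10), (0, 10)]) - area_in
rna = corrected_density(160, 84, area_in, area_out)
cdna = corrected_density(80, 84, area_in, area_out)
fold = corrected_fold_change(rna, cdna)
```

In this made-up example the raw inside ratio is 160/80 = 2, and the corrected ratio depends on how much of each method's inside density is attributed to the uniform background.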
Furthermore, to exclude any other biases could the authors clarify if the exact same probes are used between the two conditions?The updated methods mention that the same number of probes is used, but not whether these are the exact same sequences.
cDNA-ISS and RNA-ISS probes have different sequence requirements, so the chosen target sequences are different by design.We picked at random, among all the allowed targets on mRNA or cDNA, the same number of probes for both techniques, without any particular matching criterion.We acknowledge this might explain at least part of the observed variation, and perhaps a systematic analysis would reveal that the most efficient RNA-ISS probes have some distinguishing feature that we haven't accounted for and could be included in a later design optimization.
We added the following paragraph to summarize the analysis described above: A careful analysis of the detected signal spots across methods revealed that RNA-ISS captured a higher number of spots outside the tissue compared to cDNA-ISS (FIG supp?), prompting the need to rule out technical artifacts. The higher sensitivity of RNA-ISS does not seem to be explained by this increased detection outside the tissue (Pearson Correlation: -0.39, P-value: 0.29), but does instead correlate with the RNA/cDNA ratios inside it (Pearson Correlation: 0.99, P-value: 5 × 10^-15). This suggests that reads outside of the tissue have a minor impact on the fold change increase. Furthermore, RNA-ISS counts outside the tissue are correlated with the expression levels inside it (as detected by cDNA-ISS) (Pearson Correlation: 0.88, P-value: 0.001), suggesting that these spots might be mRNA molecules smeared over the glass during sample handling, and are not artifacts generated by a spurious activity of the ligase.
Lastly, are all the settings and parameters used for the image analysis of the two conditions exactly the same?
Yes, the images were acquired in exactly the same way, on the same day, on the same microscope with identical imaging settings (most notably, exposure time and LED power), on consecutive tissue sections. The analysis was performed in exactly the same way, with unchanged spot detection settings across methods. This is now clarified in the text, under "cDNA/RNA ISS comparison", by the addition of the following text: The images were acquired on consecutive sections, using the same microscope, unchanged imaging settings, and analyzed using the same image analysis pipeline with identical detection criteria.
Part 2 codex experiments: We thank the reviewer for their appreciation of our attempt at integrating CODEX and ISS on the same samples.We acknowledge the issue with the background autofluorescence, and we wish to clarify our interpretation of the data: this issue seems more of a CODEX-specific problem than an artifact due to performing CODEX after ISS.
All antibodies were initially tested in a regular immunofluorescence experiment (i.e. using secondary antibodies for detection). After this first validation, the antibodies were conjugated with DNA barcodes and tested again, this time using the DNA barcodes for staining. With this multi-step selection process, we aimed to ensure the selection of high-quality and specific protein targets. However, in a few cases, as the reviewer observed, we could see background autofluorescence in the white matter after conjugation (specifically in the corpus callosum region). We chose to incorporate these markers into the panel because they were highly specific in other regions of the brain. Examples of these are CD31, LMNB1 and Ki67. The CD31 staining, apart from the white matter autofluorescence, was highly specific, with visible vessel structures matching descriptions in the literature (1). Along the same lines, the LMNB1 antibody stained the nuclear lamina specifically (visible around the nucleus). The Ki67 antibody was also specific, with the nuclear staining pattern in the lateral ventricular area matching previous descriptions in the literature (2) (attached Figure 1). Apart from these three markers, in the MRC1 staining, autofluorescence in the white matter is most likely an artifact of over-exposure, as MRC1 is not an abundant protein in the healthy mouse brain. Also, MRC1 is used as a marker for border-associated macrophages (BAMs) rather than microglia. In our stainings, we see BAMs in the outer layer of the brain, which is in concordance with the literature (3) (attached Figure 2).
All the other markers (CNP, NEFL, S100B), apart from the ones mentioned by the reviewer as specific stainings (ACTB, GFAP, MOG), are also highly specific. Similar to MOG, CNP is a marker associated with myelin sheaths. As the corpus callosum is rich in myelinated axons, the observed CNP staining in the corpus callosum region is as expected and consistent with the literature (https://www.proteinatlas.org/ENSG00000173786-CNP/brain). NEFL is a marker for neurofilaments, which are important components of the neuronal cytoskeleton. The corpus callosum contains numerous axons that are rich in neurofilaments, which also explains the staining we observe in the corpus callosum region. The NEFL staining is also consistent with the available literature (https://www.proteinatlas.org/ENSG00000277586-NEFL/brain). In addition, we observed minor autofluorescence in the corpus callosum region in the S100B staining, and the patterns we observed in the other regions of the brain were consistent with the literature (https://www.proteinatlas.org/ENSG00000160307-S100B/brain).
We are still working on improving our stainings and incorporating applications to remove the autofluorescence arising from antibody conjugation of DNA barcodes.We hope this integration experiment can be considered as a step towards multi-modal integration approaches.
To clarify, we added the following text to the Materials and Methods section, under "Antibody conjugation": In a few cases we could detect some background autofluorescence in the white matter (specifically in the corpus callosum), arising after DNA conjugation to the antibodies. In these cases we chose to use the antibodies whenever they retained highly specific labeling in other regions of the brain. Examples of these are CD31, LMNB1 and Ki67.
We then added the following text in the results, when presenting the CODEX+ISS data: For some of the antibodies we could detect, together with the specific labeling, some unspecific staining in the corpus callosum (Supplementary Figure 3). This unspecific staining seems to arise as a consequence of the DNA-conjugation to the CODEX antibodies, and it's not specifically produced by the combined CODEX+ISS workflow (see materials and methods, "Antibody conjugation" paragraph).

Regarding the GFAP staining in Figure 3M, the reviewer is right in pointing out those concerns. Something seems off with that particular staining, and we apologize for not spotting the problem earlier. We went back to the raw data and checked a replicate of the same experiment on a consecutive section, which shows a more convincing staining. We have now edited Figure 3M to show this replicate instead.
Finally, we included a demonstration of data clustering and visualization in the Optic Tectum TissUUmaps data viewer.Because the cell density in the optic tectum is very high, instead of Cellpose-based segmentation we chose to bin the reads into hexagonal regions of size 100 micrometers.In our experience, Cellpose doesn't perform optimally on such cell-dense images, and leads to segmentation artifacts, unless specifically re-trained.
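The binning step can be illustrated with a minimal sketch. It is not the code used for the viewer: it assumes pointy-top hexagons addressed by axial (q, r) coordinates, read positions already expressed in micrometers, and it interprets the 100 micrometer "size" as the center-to-vertex distance, which is an assumption made here for illustration. All names are hypothetical.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Hex = Tuple[int, int]


def _axial_round(q: float, r: float) -> Hex:
    """Round fractional axial coordinates to the nearest hexagon (cube rounding)."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error so q + r + s == 0.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr


def hex_bin(x: float, y: float, size: float = 100.0) -> Hex:
    """Axial index of the pointy-top hexagon (center-to-vertex = size) containing (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _axial_round(q, r)


def bin_reads(reads: Iterable[Tuple[float, float, str]],
              size: float = 100.0) -> Dict[Hex, Counter]:
    """Build a bin-by-gene count table from (x, y, gene) reads."""
    counts: Dict[Hex, Counter] = defaultdict(Counter)
    for x, y, gene in reads:
        counts[hex_bin(x, y, size)][gene] += 1
    return counts


# Hypothetical reads, coordinates in micrometers.
counts = bin_reads([(10, 10, "A"), (20, -5, "A"), (173, 0, "B")])
```

The resulting bin-by-gene table plays the role that a cell-by-gene table would play after segmentation, and can be passed to the same clustering steps.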
We then proceeded to analyze and cluster these bins.When loading the Optic Tectum datasets, now by default the clustered bins are also plotted on the image, and the user can opt to see them and switch between this view and the raw reads.Through the use of the DGE plugin, the user can visualize the differentially expressed genes for each cluster interactively.The UMAP can be visualized interactively using the FeatureSpace plugin.
We hope our response satisfies all the reviewers' requests.

1) Chen MB, Yang AC, Yousef H, Lee D, Chen W, Schaum N, Lehallier B, Quake

Second decision letter

MS ID#: DEVELOP/2023/202448
MS TITLE: Open-source, high-throughput targeted in-situ transcriptomics for developmental and tissue biology
AUTHORS: Hower Lee, Christoffer Mattsson Langseth, Sergio Marco Salas, Sanem Sariyar, Andreas Metousis, Eneritz Rueda Alana, Christina Bekiari, Emma Lundberg, Fernando Garcia Moreno, Marco Grillo, and Mats Nilsson

I have now received all the referees' reports on the above manuscript, and have reached a decision. The referees' comments are appended below, or you can access them online: please go to BenchPress and click on the 'Manuscripts with Decisions' queue in the Author Area.
in which they appear in the text (e.g. Fig 2D comes after Fig 2A).

Figure 1 - Higher Ki67 staining in the lateral ventricular area.

Figure 2 - BAMs in the outer region of the brain tissue.
Nilsson group already demonstrated that this was possible (Mbp protein in Lee et al. 2022 Scientific Reports). Here this is again demonstrated with the GFAP protein. A stronger case would be made if multiple proteins are detected, because it is likely that not all epitopes are equally well preserved during the ISS protocol. However, this is a major effort and therefore toning down the claims would also suffice.