On Generation and Applications of High-Density Protein Microarrays
Ronald Sjöberg
Doctoral Thesis Kungliga Tekniska Högskolan Royal Institute of Technology School of Biotechnology Stockholm 2015
© Ronald Sjöberg 2015 Affinity Proteomics Science for Life Laboratory Division of Proteomics and Nanobiotechnology School of Biotechnology KTH, Royal Institute of Technology Tomtebodavägen 23A 171 65 Solna, Sweden ISBN 978-91-7595-573-7 TRITA-BIO Report 2015:12 ISSN 1654-2312 Printed by US-AB
Abstract
Affinity proteomics has experienced rapid development over the last two decades, and one of the most promising platforms to emerge is the protein microarray. The combination of affinity reagents and miniaturisation enables assays for simultaneous high-throughput and sensitive protein analysis. Owing to this combination of desirable properties, a multitude of protein array platforms for rapid and efficient study of proteomes and protein interactions are in use today. Although the protein microarray field has more than two decades of history to look back on, the development of new protein microarray platforms continues to this day. In Paper I of this thesis, a microarray of eluates from dried blood spot samples collected from neonates was designed and utilised for detection of complement factor 3 (C3) deficiency. The data acquired from the microarray platform were compared to C3 levels obtained through enzyme-linked immunosorbent assay (ELISA), and the microarray assay was found to separate the C3-deficient samples from the controls. The conclusion of this investigation was that the microarray platform would be suitable for high-throughput screening of C3 deficiency in neonates. Paper II outlines the work in developing a multiplex platform for validation of affinity reagents. A set of 398 affinity binders, originating from five research groups and representing polyclonal rabbit antibodies, monoclonal mouse antibodies, and recombinant single-chain variable fragments, was profiled against 432 antigens. Approximately 50% of the binders were found to preferentially recognise their intended target, while 10% of the binders generated no, or only low, signals with their respective targets. For Paper III, a reverse phase protein array (RPPA) platform using fluorescence-based detection of IgA deficiency in over 2,000 samples was validated on a label-free detection system and with ELISA.
The data from the label-free platform and the RPPA were found to agree well with each other, while the data from ELISA agreed with neither of them. The label-free platform proved to be well suited for detection of IgA in serum. Paper IV describes one of the world’s largest protein microarrays, containing 21,120 recombinant protein fragments. We describe some of the possible applications of these large-scale arrays, such as binding profiles for the validation of antibodies with 11,520 and 21,120 recombinant proteins, as well as screening for autoimmunity in human serum samples.
Populärvetenskaplig Sammanfattning (Popular Science Summary)
All living things, from viruses and bacteria to plants, humans, and other animals, consist largely of something called proteins. These proteins take care of all the processes in and between cells that allow the cells and the tissues they are part of, as well as us as thinking and living beings, to live and reproduce. When an individual falls ill, or changes in some way, this can be seen as a change in the production and interplay of proteins. This makes it of great interest to be able to study the expression of proteins. There is, however, a very large number of different proteins; in humans the number is estimated to be over one hundred thousand, and many of them are very similar to each other and therefore difficult to tell apart. This means that studying the expression of proteins and their interplay can be very challenging, especially in biologically derived fluids such as blood, where almost all proteins are present, albeit at very different concentrations. One way of doing this nonetheless is to make use of the propensity of proteins to interact with each other, and above all of a certain type of protein called antibodies. Antibodies are a group of proteins that are produced by vertebrates and whose task in the body is to distinguish and recognise specific proteins, even in very complex protein mixtures such as blood. By using antibodies, and other biomolecules that work in roughly the same way, one can study the protein composition even of these complex biological samples. Since there are so many proteins, one often wants to study many proteins at the same time, and one way of doing so is to use so-called microarrays. An array consists of known units arranged in a grid pattern with known positions, and a microarray is consequently an array on the micrometre scale.
In practice this means that specialised instruments are used to place very small amounts of different proteins on solid supports in small arrays, and in this way analyses can be performed on tens of thousands of proteins or samples simultaneously. When scaling down to the microscale, and at the same time scaling up the number of simultaneous analyses to thousands, one encounters both particular problems and entirely new possibilities compared with traditional analytical methods. It is the work of solving these problems, and of exploring these fantastic possibilities, that this thesis deals with.
This thesis will be defended on June 12 2015 at 10:00 in Inghe lecture hall in “Widerströmska huset” (Tomtebodavägen 18A, Karolinska Institute, Solna Campus) for the degree of “Teknologie doktor” (Doctor of Philosophy, PhD) in Biotechnology at KTH – Royal Institute of Technology.

Respondent: Ronald Sjöberg, MSc in Biotechnology
Affinity Proteomics, Science for Life Laboratory
Division of Proteomics and Nanobiotechnology, School of Biotechnology
KTH – Royal Institute of Technology, Stockholm, Sweden

Faculty opponent: Dr. Michael Taussig
Protein Technology Group, The Babraham Institute, Cambridge, UK
Cambridge Protein Arrays Ltd., Babraham Research Campus, Cambridge, UK

Evaluation committee:
Ass. Prof. Dr. Anders Andersson
Environmental Genomics, Science for Life Laboratory
Division of Gene Technology, School of Biotechnology
KTH – Royal Institute of Technology, Stockholm, Sweden

Ass. Prof. Dr. Sara Lind
Mass-Spectrometry Based Proteomics, Science for Life Laboratory
Analytical Chemistry, Department of Chemistry – BMC
Uppsala University, Uppsala, Sweden

Ass. Prof. Dr. Lukas Orre
Clinical Proteomics Mass Spectrometry, Science for Life Laboratory
Department of Oncology-Pathology, Cancer Proteomics
Karolinska Institute, Stockholm, Sweden

Chairman: Prof. Dr. Joakim Lundeberg
National Genomics Infrastructure, Science for Life Laboratory
Division of Gene Technology, School of Biotechnology
KTH – Royal Institute of Technology, Stockholm, Sweden

Main Supervisor: Prof. Dr. Peter Nilsson
Affinity Proteomics, Science for Life Laboratory
Division of Proteomics and Nanobiotechnology, School of Biotechnology
KTH – Royal Institute of Technology, Stockholm, Sweden

Co-supervisor: Ass. Prof. Dr. Jochen M Schwenk
Affinity Proteomics, Science for Life Laboratory
Division of Proteomics and Nanobiotechnology, School of Biotechnology
KTH – Royal Institute of Technology, Stockholm, Sweden
List of Publications
This thesis is based on the following articles, referred to by their Roman numerals. All articles are included in the Appendix.
Article I Magdalena Janzi, Ronald Sjöberg, Jinghong Wan, Björn Fischler, Ulrika von Döbeln, Lourdes Isaac, Peter Nilsson, Lennart Hammarström, (2009), Screening for C3 Deficiency in Newborns Using Microarrays, PLoS One, 4(4):e5321, doi: 10.1371/journal.pone.0005321.
Article II Ronald Sjöberg*, Mårten Sundberg*, Anna Gundberg, Åsa Sivertsson, Jochen M. Schwenk, Mathias Uhlén, Peter Nilsson, (2012), Validation of Affinity Reagents Using Antigen Microarrays, New Biotechnology, 29(5):555-63, doi: 10.1016/j.nbt.2011.11.009. * Equal contribution
Article III Ronald Sjöberg, Lennart Hammarström, Peter Nilsson, (2012), Biosensor Based Protein Profiling on Reverse Phase Serum Microarray, Journal of Proteomics & Bioinformatics, 5:185-189, doi: 10.4172/jpb.1000233.
Article IV Ronald Sjöberg, Cecilia Mattsson, Eni Andersson, Cecilia Hellström, Heng Zhu, Mathias Uhlén, Jochen M. Schwenk, Burcu Ayoglu, Peter Nilsson, Exploration of High-density Protein Microarrays for Antibody Validation and Autoimmunity Profiling, (manuscript).
Author’s Contribution to the Included Articles
The contribution of the author of this doctoral thesis to the included articles is as follows:
Paper I: Experimental planning, development and production of the array platform, analysis of the data obtained from the array platform, and manuscript writing as co-responsible author.
Paper II: Study design, experimental planning and performing array-based laboratory work, data analysis, and manuscript writing as co-responsible author.
Paper III: Experimental planning, development and production of arrays for both array-based platforms, performing all of the label-free array-based laboratory work, all data analysis, and manuscript writing as main contributor.
Paper IV: Experimental planning, development and production of protein fragment arrays, all the laboratory work and data analysis, and writing as the main contributor.
Related Articles

Affinity proteomics discovers decreased levels of AMFR in plasma from Osteoporosis patients. Qundos U, Drobin K, Mattsson C, Hong MG, Sjöberg R, Forsström B, Solomon D, Uhlén M, Nilsson P, Michaëlsson K, Schwenk JM. Proteomics Clin Appl. 2015 Feb 16. doi: 10.1002/prca.201400167.

A lateral flow protein microarray for rapid and sensitive antibody assays. Gantelius J, Bass T, Sjöberg R, Nilsson P, Andersson-Svahn H. Int J Mol Sci. 2011;12(11):7748-59. doi: 10.3390/ijms12117748.

A roadmap to generate renewable protein binders to the human proteome. Colwill K, Persson H, Jarvik NE, Wyrzucki A, Wojcik J, Koide A, Kossiakoff AA, Koide S, Sidhu S, Dyson MR, Pershad K, Pavlovic JD, Karatt-Vellatt A, Schofield DJ, Kay BK, McCafferty J, Mersmann M, Meier D, Mersmann J, Helmsing S, Hust M, Dübel S, Berkowicz S, Freemantle A, Spiegel M, Sawyer A, Layton D, Nice E, Dai A, Rocks O, Williton K, Fellouse FA, Hersi K, Pawson T, Nilsson P, Sundberg M, Sjöberg R, Sivertsson Å, Schwenk JM, Takanen JO, Hober S, Uhlén M, Dahlgren LG, Flores A, Johansson I, Weigelt J, Crombet L, Loppnau P, Kozieradzki I, Cossar D, Arrowsmith CH, Edwards AM, Gräslund S. Nat Methods. 2011 May 15;8(7):551-8. doi: 10.1038/nmeth.1607.

Validation of serum protein profiles by a dual antibody array approach. Rimini R, Schwenk JM, Sundberg M, Sjöberg R, Klevebring D, Gry M, Uhlén M, Nilsson P. J Proteomics. 2009 Dec 1;73(2):252-66. doi: 10.1016/j.jprot.2009.09.009.

Selective IgA deficiency in early life: association to infections and allergic diseases during childhood. Janzi M, Kull I, Sjöberg R, Wan J, Melén E, Bayat N, Ostblom E, Pan-Hammarström Q, Nilsson P, Hammarström L. Clin Immunol. 2009 Oct;133(1):78-85. doi: 10.1016/j.clim.2009.05.014.

Anoctamin 2 protein as an autoimmune target in multiple sclerosis. Burcu Ayoglu, Nicholas Mitsios, Ingrid Kockum, Mohsen Khademi, Björn Forsström, Ronald Sjöberg, Johan Bredenberg, Izaura Lima, Eric Holmgren, Hans Grönlund, André Ortlieb Guerreiro Cacais, Nada Abdelmagid, Mathias Uhlén, Tim Waterboer, Lars Alfredsson, Jan Mulder, Jochen M. Schwenk, Tomas Olsson, Peter Nilsson (submitted).

Affinity proteomics in plasma within Hodgkin and diffuse large B-cell lymphoma identifies proteins for disease identification. Frauke Henjes, Claudia Fredolini, Kimi Drobin, Davide Tamburro, Mun-Gwan Hong, Björn Forsström, Ronald Sjöberg, Elin Birgersson, Eni Andersson, Henrik Hjalgrim, Ingrid Glimelius, Janne Lehtiö, Jacob Odeberg, Mathias Uhlén, Bengt Glimelius, Peter Nilsson, Karin E. Smedby and Jochen M. Schwenk (in revision).
Preface
Many years ago I was standing on a dusty construction site looking at a steel beam that I, then a carpentry apprentice, and my team partner, the carpentry master, were supposed to fire-proof according to set specifications. Not only did this require a specific set of fire-proofing goals to be reached, but those goals had to be reached while still making the beam fit into the space defined by the blueprint. Because those who make blueprints quite often live in a different world from those who have to make the blueprints become reality, some problem solving was needed to achieve the set goals. As my master was complaining about the impossibility of the task ahead, I was silently contemplating how construction work is mostly problem solving, and how some people got paid a lot more for solving problems without having to get dusty and dirty every day and retire early with worn-out backs, shoulders and knees. As I explained to my master how the work should be done, I slowly started to realise that maybe I should do something else with my life, maybe something like engineering. At that time the construction business was in a slump and I rather enjoyed the downtime that was common between jobs, as it gave me plenty of time to read interesting books instead of breathing asbestos-filled air on badly run construction sites. But nothing lasts forever, and once I had become a carpentry master myself and the construction business started to get busy again, I went back to school to see what else there was to learn, and whether I could get good enough grades to enter university and escape the harsh reality of manual labour. I already had my sights set on engineering, but I had not decided whether it would be in mechanics or chemistry. But as I got a lower grade in physics, owing to the physics teacher suffering a heart attack during test grading and my test never being found by the substitute teacher, I decided that chemistry was the way to go.
After a few interesting years at university I finished my master’s thesis at the School of Biotechnology at KTH and started working in what was then called the “PrEST Array group”, where we were slowly trying to build up the protein microarray platform for validation of Human Protein Atlas antibodies. This supplied me with ample opportunities to do some interesting problem solving besides the mind-numbing routine antibody dilutions, and although I was constantly planning to leave Stockholm and return to my home town, I never seemed to be able to leave. I guess the longer and darker winters and my lack of interest in heavy industry, snowmobiles, winter sports, or hunting contributed to my extended stay in Stockholm. I quickly discovered that working with a technology that is so widely useful always seems to supply you with new ways of being creative and of discovering amazing new possibilities, especially if you are free to improvise and to perform whatever test you wish without the need for immediately publishable results. Over the years I acquired the knowledge and know-how that have now resulted in this short text in an almost organic way, and only a small part of it could be fitted between these book covers. The most important knowledge that I have acquired since I started working with arrays is not really the technical details concerning microarrays, but rather a way of thinking that I was never taught in school. Unfortunately that is not easily translated into text, so general technical details of protein microarrays will have to do. Owing to the diverse nature of the field of protein microarrays I can only describe a small part of it. I have tried to include the three microarray variants that I have come in direct contact with, but this is in no way a complete description. I hope that you will enjoy reading this book just as much as I enjoyed writing it, because life should be about having fun.
Table of Contents

Abstract
Populärvetenskaplig Sammanfattning (Popular Science Summary)
List of Publications
Author’s Contribution to the Included Articles
Related Articles
Preface
Introduction
From DNA to Proteomes
    Proteins
    Proteome
        The Complexity of Proteomes
Affinity Reagents
    Antibodies
        Antibody Isotypes
        Polyclonal Antibodies
        Monoclonal Antibodies
    Antibody Derivatives
    Other Affinity Reagents
Proteomics
    Affinity Proteomics
        Human Protein Atlas
Protein Microarrays
    Protein Microarray Formats
        Antigen Arrays
        Capture Arrays
        Reverse Phase Arrays
Technical Considerations for Protein Microarrays
    Contact Printing
    Non-contact Printing
        In Situ Synthesis
    Substrates
    Detection Principles
        Label-based Assays
        Label-free Assays
    Blocking and Background
        Local Background
        Global Background
    Multiplexing
    Image Analysis
Concluding Remarks
Present Investigations
    Paper I - Screening for C3 Deficiency in Newborns Using Microarrays
        Aims of the investigation
        Summary of findings
        Results and conclusions
    Paper II - Validation of Affinity Reagents Using Antigen Microarrays
        Aims of the investigation
        Summary of findings
        Results and conclusions
    Paper III - Biosensor Based Protein Profiling on Reverse Phase Serum Microarray
        Aims of the investigation
        Summary of findings
        Results and conclusions
    Paper IV - Exploration of High-density Protein Microarrays for Antibody Validation and Autoimmunity Profiling
        Aims of the investigation
        Summary of findings
        Results and conclusions
Future Perspectives
Acknowledgements
References
Introduction
Most organisms that we usually think of as “alive” consist of one or more cells. These cells are composed of a lipid layer encapsulating a biomolecular machinery and the information describing how that machinery should be constructed and function. This information about what the cell is capable of is stored in genes encoded in what is called the genome. The genome is, on a basic level, both the beginning and the end of the organism, as the organism would not exist without it, and once the cell has ceased to exist new copies of the information will have been transferred to new generations of the cell. Even though the information regarding the structure and function of the cell is stored in its genome, what the cell does, and how the information is ultimately expressed, is most accurately gleaned through study of its biomolecular machinery. In the end, the activity and regulation of this biomolecular machinery is dominated by what we call proteins. This means that as biomolecular changes happen in a cell, it will be possible to detect them as changes in the protein content of the cell. As cells and tissues grow, age, or become afflicted by disease, the changes in protein composition and concentration will be an indicator of what is happening inside the organism. This window into life and death makes the study of the protein content of cells, tissues, or bodily fluids very interesting to anyone who strives to understand more about the biology of life. If the goal is to measure the general protein content of a biological sample, a classical assay such as a Bradford assay, an SDS-gel assay, or an enzyme-linked immunosorbent assay will probably be able to deliver what is needed. But if the goal is to measure samples and proteins on a large scale, and so enter proteomics, affinity proteomics in the form of protein microarrays is a powerful choice. These protein microarrays, and their construction and use, are the subject of this thesis.
From DNA to Proteomes
DNA, or deoxyribonucleic acid, constitutes the information storage of cells, and was first discovered as an unknown substance in the nuclei of cells by Friedrich Miescher in 1869. Subsequent work on the characterisation of this substance found it to be composed of four basic components, adenine, thymine, cytosine and guanine, and showed that the common structure and role of DNA is that of a double-stranded polymer [1] that stores the hereditary information in cells [2]. These components are called ‘nucleotide bases’ and are arranged in stretches that contain sub-sequences called ‘genes’, and each gene can code for one or several functional products in the form of ‘proteins’. In the sequences between these protein-encoding genes reside regulatory elements that control the expression of genes, as well as other non-protein functional products [3]. The genetic information is expressed by transcription into RNA (ribonucleic acid), in which thymine is exchanged for uracil. RNA can fulfil several roles in a cell, one being the transfer of information on protein structure from DNA into protein. Different prefixes or suffixes are used depending on which role the transcribed RNA fulfils, and mRNA (messenger RNA) serves as the messenger between the information stored in DNA and the expression of protein [4]. Proteins, on the other hand, use twenty different building blocks, all belonging to a class of molecules called amino acids. This larger alphabet increases the number of possible combinations and offers much larger variability in the structure of proteins than is possible with DNA and RNA. ‘The central dogma’ describes the transfer of information stored in DNA through transcription to mRNA and translation into proteins.
It was first postulated by Francis Crick in 1958 and is a negative statement saying that “once (sequential) information has passed into protein it cannot get out again”, or, put more simply, that transfer of information from protein back to DNA does not exist. DNA and RNA can both be replicated, and in some cases information stored in RNA can flow backwards to DNA, but in no case can information stored in proteins flow back to either RNA or DNA [5]. However, the transcription of DNA is in many ways regulated by proteins, and so transcription, translation, and regulation finally form a circle. This circle is an important part of what we call ‘life’, and the study of this circle is incorporated into the research field called ‘life science’, of which protein microarrays are a part.
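The flow of information described in this section can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the example sequence, and the four-entry codon table (a small subset of the standard genetic code) are assumptions made for this sketch and do not come from the thesis. It shows transcription of a coding strand into mRNA by exchanging thymine for uracil, translation into a short peptide, and how a twenty-letter alphabet gives a far larger sequence space than a four-letter one for the same chain length.

```python
# Illustrative sketch; all names and the example sequence are hypothetical.

# Small subset of the standard genetic code (mRNA codon -> one-letter amino acid).
CODON_SUBSET = {
    "AUG": "M",  # methionine, also the usual start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "UAA": "*",  # stop codon
}


def transcribe(coding_strand: str) -> str:
    """Return the mRNA for a DNA coding strand (thymine is exchanged for uracil)."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate codon by codon until a stop codon or the end of the sequence."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_SUBSET[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


def sequence_space(alphabet_size: int, length: int) -> int:
    """Number of distinct linear sequences of a given length."""
    return alphabet_size ** length


dna = "ATGTTTGGCTAA"
mrna = transcribe(dna)
print(mrna, translate(mrna))
# Ten positions: four nucleotide bases versus twenty amino acids.
print(sequence_space(4, 10), sequence_space(20, 10))
```

Note that the sketch only runs in the DNA to RNA to protein direction; there is no function that maps a peptide back to a unique nucleotide sequence, which mirrors the negative statement of the central dogma.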
Proteins
Proteins were recognised as a special class of multicomponent molecules as early as the mid-eighteenth century. However, a more accurate description did not appear until 1838, when a Dutch chemist named Gerardus Johannes Mulder could show that this specific class of molecules was mainly constructed from six basic elements: carbon, hydrogen, nitrogen, oxygen, phosphorus, and sulphur. The term ‘protein’ originates from the Greek word ‘proteios’, which means ‘primary’, and was coined not long after by his associate Jöns Jacob Berzelius as a reference to it being of foremost importance for the living body. Proteins are built from 20 amino acids, and each amino acid consists of a carbon scaffold with amine and carboxyl functional groups and a side chain specific to each type of amino acid. Strung together as chains, these amino acids form the primary protein structure. The amino acids in these chains interact with each other through hydrogen bonds within the chain, folding the protein into secondary structures such as helices and sheets. In the end, the sum of the interactions within the amino acid chain, comprising hydrophobic interactions, salt bridges, hydrogen bonds, and disulphide bonds, makes the chain and its secondary structures fold into a tertiary structure that gives the protein its overall shape. The folded proteins can then form protein complexes by assembling into multimeric quaternary structures. Proteins can also be altered by post-translational modifications, such as changing the length of the protein chain, adding functional groups, or otherwise changing the structure of the protein. The finished proteins then interact with each other both within and between cells. These protein-protein interactions provide physical structure for the cells, perform enzymatic reactions, and relay signals within and between cells.
Proteome
The proteome is the entire set of proteins expressed by a cell or organism at a specific time and place. The word is a portmanteau of protein and genome, where genome has its origin in the German word ‘genom’, which in turn originates from the Greek word ‘gene’, meaning ‘creation’ or ‘birth’. The word ‘proteome’ was derived in this way on the basis of the proteins being expressed by the genome of the organism [6], and the human genome comprises approximately 20,000 genes that are annotated as encoding proteins [7]. Most cells are believed to express a large fraction of the protein-coding genes [8], and significant correlation between levels of transcription and expression has been observed [9]. However, what differentiates most organs and tissues in the body is that their cells differ from each other in their levels of expression and in the post-translational modifications of their proteins. These differences in expression levels and modifications are what give otherwise similar cells different phenotypes and functions, and this means that the proteome differs depending on the local milieu. Accordingly, a multitude of proteomes can exist in the same organism, even though every cell in the organism has the same genome. This property of protein expression and modification makes the study of proteomes of great interest in the field of life science.
The Complexity of Proteomes
Many interesting applications of protein research involve measuring protein levels in complex biofluids such as blood or tissue lysates. Samples derived from blood, i.e. serum and plasma, are of high interest to life science researchers because blood is readily available for sampling and comes into direct contact with almost all of the body. Changes that occur anywhere in the body should then be detectable in the circulating blood as a change in the blood proteome. These changes can be difficult to detect since proteins that have leaked out of cells, due to disease or other damage, might make up only a small fraction of the proteins in the blood. Finding and quantifying these interesting but low-abundance proteins in complex biological samples can be challenging, since protein concentrations might range over ten orders of magnitude.10 This complexity and large variation in abundance can cause interesting proteins to be masked from detection by the noise generated by high-abundance protein species.
One way of combating this complication is to deplete complex samples of the most prevalent proteins, or to separate or fractionate the constituent proteins. This lowers the complexity and hopefully enables analysis of the low-abundance protein species. Although this might seem like a suitable solution for single samples, it can become labour-intensive and impractical when sample collections number in the hundreds. An even more problematic outcome is that it could introduce bias in the final representation of the protein content, as has been shown for depletion.11 Extensive sample preparation should therefore be avoided if possible.
Affinity Reagents
Affinity reagents are compounds that bind specific substances and that are used to detect or affect their specific targets in biochemical contexts. They can be derived from various sources and comprise several different classes of molecules. The most common type of affinity reagent is the antibody, but antibody-derived affinity reagents such as single-chain variable fragments and Fab fragments have also been developed and are of growing use. Other types of affinity reagents can be based on other classes of proteins and peptides, or even oligonucleotides. The type of backbone used for a reagent can greatly affect its chemical and biological properties, and it should thus be possible to design and produce reagents that fit specific purposes. Even though a multitude of affinity reagents are available, truly well-validated affinity reagents can be hard to come by12.
An ‘antigen’ is a substance that is capable of binding to antibodies produced by the immune system in antibody-producing organisms. The word itself is an abbreviation of “antibody generator”, as some antigens can provoke an immune response that causes production of antibodies. Although the term originates from antibody production by an immune system, the word is today used as a general catch-all for substances that are used as targets for generating affinity reagents, regardless of whether those reagents are antibodies. Antigens are often produced in cell cultures, and production often involves expressing the protein as a fusion with another protein to aid in purification and recovery.13 This is not always successful, as the protein of interest may be insoluble or toxic to the host, causing incomplete expression. To simplify the expression step, protein fragments are sometimes
chosen to represent the full-length protein instead. However, properly folded full-length proteins have been shown to result in a higher degree of conformation-specific reagents,14 which can be desirable if the target protein retains its native conformation during the assay.
Antibodies
The concept of antibodies dates back to the beginning of the humoral theory of immunity in 1890, and many still think primarily of antibodies when they think of affinity reagents. Antibodies are a class of proteins commonly called immunoglobulins (Ig for short), and are part of the immune system in all jawed vertebrates15. Antibodies are produced in B-cells and are roughly Y-shaped molecules that comprise two functional parts: one part on each ‘arm’ that recognises and binds to antigens, and a second part that is responsible for binding to effector molecules and receptors present on other cells in the immune system. These two parts of the antibody are generally referred to as the ‘Fab-region’ and the ‘Fc-region’.
Figure 1: An antibody and its regions. “VH” and “VL” are variable regions of the heavy and light chains. These, here shown as blue modules, will together with the constant region of the light chain (“CL”, light grey module) and the first constant region of the heavy chain (“CH1”, top dark grey module) build up the antigen-binding part of the antibody, or the “Fab-region”. The second and third constant regions of the heavy chains, the four lower dark grey modules, will form the “Fc-region”, which activates biological functions. These two regions, the recognition and the activation regions, are connected through the hinge region, here shown in brown. Variations in the hinge region and the Fc-region generate different antibody subtypes.
Both the Fab-region and the Fc-region are composed of products from multiple gene segments, and some of these gene segments are highly variable at the gene level. As the B-cells mature from precursor cells, the variable regions rearrange to produce new versions of these variable gene segments. At expression, the protein products of these variable gene segments are combined with the protein products of constant gene segments to form the finished antibodies. The variable regions form the epitope-binding parts of the Fab-region of the antibody, and the genetic rearrangement results in the generation of B-cells that produce antibodies that recognize different antigens16. The variable regions of the antibody thus ensure that the antibodies can recognise the wide variety of possible antigens that need to be detected for effective immunity. Antibodies commonly do not recognize the whole antigen but only a small part of it, a so-called epitope. These epitopes can be linear stretches of the antigen, or conformational epitopes consisting of different parts of the antigen that are close to each other in the three-dimensional structure of the antigen. This flexibility in epitope recognition further increases the recognition possibilities of antibodies, and while the variable Fab-regions rearrange during maturation to recognize different antigens, the Fc-regions remain constant, allowing the effector functions to be retained. However, by removal of parts of the constant gene segments during cell division, different versions of the Fc-region can be produced by different B-cell clones while retaining the same Fab-region. This gives the immune system a certain degree of modularity in the combination of antigen recognition and effector function, and combining different effector functions with the same antigen recognition results in different antibody ‘isotypes’.
Antibody Isotypes
Different effector functions for antibodies with the same epitope recognition are achieved by combining the Fab-region with a few different Fc-regions that carry different effector functions. These combinations are referred to as isotypes, of which there are five main classes, IgM, IgG, IgE, IgA, and IgD, and all of these fulfil different functions in the immune system17. IgM can be found in all vertebrate species, and as such is evolutionarily the oldest of the five main classes of antibodies. It is the first antibody isotype to be produced during immune
response, and it can be found as a dimer on the outside of B-cells, or secreted as a pentamer into the circulatory system. In its membrane-bound form it regulates B-cell response and survival as a B-cell receptor, while secreted IgM can be found as both natural IgM and immune IgM. Natural IgM is constantly produced, and is mainly responsible for triggering clearance of apoptotic cells. Immune IgM is produced as a response to pathogen exposure and can, upon recognition of a foreign invader, activate a part of the immune system called the complement system to kill the invader. The antigens for secreted IgM are often lipids or polysaccharides, and secreted IgM often shows broad specificity with low affinity but, owing to its pentameric structure, high valency18, 19. IgG is the most abundant immunoglobulin in the blood circulatory system, and appears after multiple antigen challenges have led to maturation of the antibody response. It is secreted as a monomer and can be divided into four subclasses due to small variations in the Fc-region. These subclasses are called IgG1, IgG2, IgG3, and IgG4, and all differ in the flexibility of their hinge region, thus giving them different effector functions. IgG induces activation of the complement system, and activates phagocytes that ingest foreign or harmful material20. IgE binds to receptors on the outside of cells in the immune system, and can upon antigen recognition stimulate these cells to release inflammatory mediators. IgE mostly confers immunity towards parasites, but also plays a big role in allergic reactions21. IgA is found mostly in the mucosa, where it functions as a first line of defence against bacteria and viruses, but can also be found secreted in blood and, like IgG, it is divided into subclasses, of which there are two22, 23. IgD is mostly found on the surface of B-cells, where it is co-expressed with IgM, and is believed to control B-cell activation and suppression24.
Polyclonal Antibodies
B-cell clones with different versions of the variable region will bind different epitopes on the same antigen, which results in pools of antibodies with different clonality25. Collections of these antibodies that originate from different B-cell clones are therefore called ‘polyclonal antibodies’. Polyclonal antibodies are a common reagent in affinity proteomics, and are usually produced by immunizing a host animal with the antigen of interest, causing the immune system of the host to produce antibodies against the antigen25. The strength of polyclonal antibodies is their ability to recognize many different epitopes of the same antigen,
both linear and structural26. This effectively means that multiple antibodies can bind to the same antigen, which increases the probability of detecting the antigen. This has been shown to lower the limit of detection and to make them less sensitive to masking of epitopes, thus making them useful in more applications.25, 27 As each host produces its own pool of polyclonal antibodies, batch-to-batch variation is introduced, since the pool of antibodies extracted from one host animal is not exactly the same as the pool extracted from another host animal. This means that the renewability of polyclonal antibodies is limited, and they are therefore usually used in vitro and not in vivo, due to the difficulties in determining the properties of changing pools of antibodies with different clonalities.
Monoclonal Antibodies
Monoclonal antibodies originate from the same B-cell clone, and consequently bind the same epitope of the same antigen. They are produced in hybridoma cells, a fused hybrid of spleen cells from a host animal immunized with the antigen of interest, and myeloma cells28, 29. The result of the fusion is an immortal cell line that produces identical antibodies, and as long as the cell line is kept alive there is a constantly renewable source for antibodies with the same singular clonality. Monoclonal antibodies are labour-intensive and expensive to develop but, once established, can be renewed indefinitely. The monoclonality makes it easier to determine their epitope binding properties, as only one epitope has to be characterised. However, since they are dependent on the accessibility of one single epitope they may also have limited functionality and not work in certain applications if that specific epitope becomes masked or otherwise inaccessible.
Antibody Derivatives
The modularity of antibodies provides the basis for affinity binders derived from chosen segments of full-sized antibodies. Instead of producing whole antibodies, fragments of antibodies can be produced and utilised as affinity reagents. The two most common variants are the ‘fragment antigen binding’ (Fab) fragment and the ‘single-chain variable’ (ScFv) fragment. The Fab fragment consists of the Fab-region of the antibody without the Fc-region, while the ScFv comprises only the variable region of the antibody. These smaller parts of full-size antibodies can be
recombinantly produced in large bacterial or phage libraries through random recombination of the genes, which makes them cheaper to produce than monoclonal antibodies. Their small size potentially allows them to penetrate deeper into tissue, and their recombinant production makes it possible to improve their affinity and to fuse them to other molecules30, 31.
Other Affinity Reagents
The quest for designed affinity binders does not stop with the antibody-derived fragments, and there are several affinity molecules based on other protein moieties or on DNA/RNA molecules. These are often referred to as scaffold binders or small-molecule binders. Scaffold binders are based on stable molecular scaffolds that have an integrated affinity function that can be varied through combinatorial engineering to produce different binding characteristics.32 An example of a scaffold binder is the Affibody molecule, which is based on protein A and comprises a three-helix bundle in which the amino acids in two of the helices can be changed by random mutation in combinatorial libraries or by directed engineering to produce different binding characteristics.33 Other examples of scaffold-based or small-molecule binders are Anticalins (derived from lipocalins)34, Adnectins (derived from fibronectins)35, DARPins (ankyrin repeat proteins)36 and Aptamers (oligonucleotides)37. These scaffolds and small molecules can be engineered to be stable under environmental conditions that would not be suitable for full-length antibodies and, similarly to antibody fragments, they can also be selected for high affinity and combined with other molecules.
Proteomics
The large-scale study of a field in biology is often referred to as -omics, such as proteomics for proteome-wide studies of the entire dynamic library of proteins, and all their modifications, localization, interactions and expression levels. The proteome-wide mapping of the entire set of proteins in all their variants, which may number in the hundreds of thousands with all possible combinations, represents a unique challenge38. The fields of genomics and transcriptomics have experienced rapid growth in discoveries and research since mass-producing copies of DNA strands became achievable through PCR. The ease of access to an almost unlimited number of copies laid the foundation for the enormous success that the field of genomics has achieved so far. However, for proteomics there is no equivalent to PCR, and the production of proteins often becomes a bottleneck because of this. Each protein needs to be recombinantly produced or synthesized as short peptides, and the methods available for this are often complicated and entail many steps in which the production or purification can fail. Common problems during recombinant protein production are failure in cloning of vectors, toxicity of the antigens to the host cells, and incorrect folding or posttranslational modifications. This means that the production of proteins can be both costly and time-consuming, and these limitations represent a considerable challenge if studies are to be performed on a true proteome-wide scale.
Affinity Proteomics
Affinity proteomics has become an important tool for studying the expression, localization, functions, and interactions of proteins using large-scale and high-throughput methods. This is owed to the possibility offered by affinity reagents to analyse complex proteomes without the need for extensive sample workup12, 39. By employing affinity proteomics, an increased understanding of how cells and organisms function can be reached, and new drug targets or diagnostic markers for diseases can be discovered. Many different affinity-based proteomic methods for achieving this exist, and some classical affinity-based methods are ELISA and Western blot. Although originally considered low-throughput, increased automation of laboratory work has led to higher throughput while using these classical
methods12. However, this increase in automation still does not allow for the throughput necessary to perform truly large-scale affinity proteomics, and large-scale affinity proteomics methods are therefore today dominated by various microarray formats such as tissue microarrays40, protein microarrays41, bead arrays42 and lysate arrays43. One limitation of affinity proteomics is the access to well-characterised affinity reagents, and this negatively affects how efficiently the information in the proteome can be mined. In order to address this limitation, and to enable researchers to investigate more of the proteome, several projects aiming to produce affinity reagents have been started14, 44-46. The lack of characterisation generally expresses itself as a problem through off-target interactions, which is when an affinity reagent interacts with more than the intended target. These off-target interactions generally increase in number as the complexity of the sample increases, due to the increase in possible interaction partners. An affinity reagent that has been produced to be specific for a certain antigen might very well interact with other available antigens, especially if they are available in higher concentration than the intended target. The severity of this is easily understood if one considers a sandwich assay. A sandwich assay comprises a capture molecule, a target molecule, and a detection molecule, and in the ideal case the capture and detection molecules will not have the same off-target interactions. For a binding event to be detected in such an assay, both the capture molecule and the detection molecule must have found the target, and off-target interactions will thus not be detected.
However, if more than one pair of capture and detection molecules is used, off-target interactions can occur between different detection molecules as well as between detection molecules and captured molecules, in addition to the possible off-target interactions between capture molecules and target molecules. Such an assay, of N targets, will have almost 4N² possible interactions between capture, target, and detection molecules. For a 100-plex assay this results in almost 40 000 different theoretical interaction pairs47. This complication can to some extent be mitigated by employing single-binder assays, where the whole sample is labelled, thus forgoing the need for a detection antibody. However, off-target interactions can still be expected, especially when using polyclonal antibodies48. These off-target interactions when multiplexing consequently become a problem that has been limiting some areas in the field of affinity proteomics49-51. There is therefore a great need for well-characterised affinity reagents that have been well validated for specificity in each assay format that they are intended to be used in12.
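The growth of possible interactions with multiplexing level quoted above can be reproduced with a few lines of Python. This is only a sketch of the 4N² upper-bound estimate given in the text; the function name `max_theoretical_interactions` is introduced here for illustration and does not come from the cited work.

```python
def max_theoretical_interactions(n_targets: int) -> int:
    """Upper-bound estimate (4 * N^2) of possible interactions between
    capture, target, and detection molecules in an N-plex sandwich assay."""
    if n_targets < 0:
        raise ValueError("number of targets must be non-negative")
    return 4 * n_targets ** 2


if __name__ == "__main__":
    # The estimate grows quadratically with the multiplexing level.
    for n in (1, 10, 100):
        print(f"{n:>4}-plex: up to {max_theoretical_interactions(n)} interactions")
```

Because the estimate is quadratic, a ten-fold increase in multiplexing gives a hundred-fold increase in theoretical interaction pairs, which is why specificity validation becomes more demanding as assays are multiplexed.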
Human Protein Atlas
One of the largest undertakings to generate affinity reagents for proteomic research has been the Human Protein Atlas52 (HPA). The goal of the Human Protein Atlas has been to generate high-quality antibodies towards all human proteins for a systematic exploration of the human proteome. This is being achieved by employing high-throughput production of antibodies against proteins derived from the approximately 20 000 protein-encoding genes in the human genome. These antibodies have been used to study the protein profiles of human tissues and cells, as well as different body fluids53, 54, and have also been applied to non-human tissues55. In order to achieve highly specific antibodies with preferably no off-target interactions, a strategy of using as unique a part of each target as possible for the generation of reagents has been employed. This entails choosing protein fragments so that they represent a region of low homology, with less than 60% sequence identity to any other human protein. Transmembrane regions and signal peptides are also excluded, and no more than eight identical amino acids within any ten amino acid stretch, as compared to any other human protein, are allowed. The length of these fragments is on average around 80 amino acids, which allows for generation of antibodies towards multiple possible epitopes. By targeting a unique part of each protein in this way it is possible to reduce the promiscuity of the antibodies generated towards the protein, while retaining the flexibility of polyclonal antibodies and ensuring that the antibodies will be applicable in many different assays. In HPA, the protein fragments are fused with a His6ABP-tag containing a six-histidine part and an albumin binding protein part. The histidine-tag allows for purification and detection with tag-specific antibodies, and the albumin binding part increases the half-life of the antigen during immunization.
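The rule of at most eight identical amino acids in any ten-residue stretch can be illustrated with a small Python sketch. It assumes an ungapped, position-by-position comparison of every window pair, which is a simplification chosen here for illustration and not necessarily how the HPA selection software works; the function names and the toy sequences are hypothetical.

```python
def max_window_identity(fragment: str, other: str, window: int = 10) -> int:
    """Largest number of identical positions between any window of the
    candidate fragment and any window of another protein (ungapped)."""
    best = 0
    for i in range(len(fragment) - window + 1):
        frag_win = fragment[i:i + window]
        for j in range(len(other) - window + 1):
            other_win = other[j:j + window]
            # Count positions where the two windows carry the same residue.
            matches = sum(a == b for a, b in zip(frag_win, other_win))
            best = max(best, matches)
    return best


def passes_window_rule(fragment: str, other: str,
                       window: int = 10, max_identical: int = 8) -> bool:
    """True if no window pair shares more than `max_identical` residues."""
    return max_window_identity(fragment, other, window) <= max_identical


if __name__ == "__main__":
    candidate = "ACDEFGHIKLMNPQ"  # toy sequence, not a real HPA fragment
    print(passes_window_rule(candidate, "ACDEFGHIWW"))  # 8 identical: allowed
    print(passes_window_rule(candidate, "ACDEFGHIKW"))  # 9 identical: rejected
```

In practice such a check would be run against every other human protein, and the brute-force double loop shown here would be replaced by an indexed search; the sketch only makes the criterion itself concrete.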
To obtain the polyclonal antibodies from the sera of the immunised animals, a two-step purification is used. First, a depletion of tag-specific antibodies is performed using affinity columns on which the tag is immobilised. This is then followed by purification of the fragment-specific antibodies on a column with the protein fragment immobilised56-59. By
employing this dual affinity purification, antibody-antigen pairs that can be utilised on a multitude of different proteomic platforms within the HPA are obtained.
Figure 2: Distribution of fragment lengths. The distribution of fragment lengths of protein fragments produced as antigens for antibody generation within the Human Protein Atlas. The protein fragments have a mean length of 81 and a median length of 80 amino acids, and vary between 16 and 202 amino acids in length. The average weight of the protein fragments is 27 kDa, excluding the 18 kDa His6ABP affinity tag.
Currently the Human Protein Atlas contains a tissue atlas with immunohistochemical images and information on human genes on both protein and mRNA level for all major organs. It also comprises a cancer atlas for the most common forms of cancer, a cell line atlas with expression profiles for a number of human cell lines, and a subcellular atlas with immunofluorescence images for three cell lines52.
Protein Microarrays
The first result for “microspot” in the PubMed database is an article in Nature by Joseph G. Feinberg published in 1961, “A ‘Microspot’ Test for Antigens and Antibodies”. In this article a microscope cover-glass was coated with antiserum-agar, and microspots of antigens were deposited using capillary tubes for detection of antigen-antibody precipitates60. However, the birth of the first true microarray is usually dated to the work of Patrick O. Brown et al. published in 1995, in which they arrayed a set of 48 cDNAs from Arabidopsis thaliana61. The concept of microarrays builds on the “ambient analyte immunoassay” as described by Roger P. Ekins in 198962, whose effect on microarrays is summarized in a review by Thomas O. Joos et al.63 In short, using a high concentration of capture reagent in a small area results in the highest possible local signal density, without affecting the local concentration of analyte to any considerable degree. This means that the amount of analyte captured directly reflects the concentration of the analyte in the sample, as the measured signal is only dependent on the analyte concentration and the affinity constant of the capture reagent. One major advantage of this is that the assay becomes independent of sample volume, as long as there are enough analyte molecules present for the concentration not to be affected in any significant way by the capture consumption. However, if the concentration of the analyte is low while the affinity of the binder is high, reaching equilibrium may take unacceptably long due to constraints in mass transport64.
The typical protein microarray assay entails a first step of immobilising a protein, or a mix of proteins, on a solid phase substrate. This is then followed by a primary interaction step where a liquid phase that contains an interaction partner to one of the immobilised proteins is added.
A second interaction step is then often employed for detecting the primary interaction reagent, before a reporting step is performed to quantify the signal from the reporter molecule.
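The ambient analyte argument summarised earlier in this section can be made concrete with a simple equilibrium sketch. This is a textbook single-site binding model and not a derivation taken from the cited works; the symbols $K$ (affinity constant), $[A]$ (free analyte), $[A]_0$ (analyte concentration in the sample), $[B]$ (free capture sites), $[AB]$ (occupied capture sites) and $F$ (fractional occupancy) are introduced here for illustration only.

```latex
\begin{align}
K &= \frac{[AB]}{[A]\,[B]} \\
F &= \frac{[AB]}{[B] + [AB]} = \frac{K[A]}{1 + K[A]} \\
[A] &\approx [A]_0 \quad \text{(captured amount negligible compared to the total analyte)} \\
F &\approx \frac{K[A]_0}{1 + K[A]_0}
\end{align}
```

Under these assumptions the fractional occupancy of a spot depends only on the analyte concentration and the affinity constant, which is consistent with the statement that the assay becomes independent of sample volume as long as capture does not noticeably deplete the analyte.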
Figure 3: Molecular complex for detection of a target protein. Molecular complex of a mixed sample attached to a solid phase substrate, containing a protein that is being recognised by a primary detection reagent, in this case an antibody. The primary detection antibody is then recognised by a secondary detection reagent that carries a reporter molecule. This is a basic protein microarray setup, and the generation of this complex follows the steps of arraying on the substrate, adding the primary reagent, adding the secondary reagent, and finally detecting the reporter molecule.
Planar microarrays are manufactured by spotting small volumes of an analyte within a small spatial area on a substrate such as a microscope slide to form grids of what is generally referred to as ‘spots’ or ‘features’. These volumes are usually in the picolitre to nanolitre range, and the distance between features and their diameter are usually in the micrometre range. The low volumes, and the accompanying small spatial area that is needed for each feature, mean that large numbers of analytes can be arrayed and simultaneously analysed on one single slide. This has predominantly been used for the analysis of gene expression by arraying single-stranded DNA or synthesized oligonucleotides and applying samples consisting of labelled complementary single-stranded DNA. However, the technique has also become common for use in protein analysis65. The final result is often a false-colour image of the array with one or more colours representing different detection molecules, as can be seen in figure 4.
Figure 4: A false-colour image of a protein microarray with 384 different proteins. Each feature in the array contains a unique protein identity and is visualised using detection reagents specific for a small part of the protein that is common to all unique proteins in the array. This results in a green signal showing the presence of proteins in each feature and aids in quality control and image analysis. An antibody targeting one of the proteins has been added and can be detected as a red signal in one of the features, in this case in the first column and third row of features.
Protein Microarray Formats
Not only protein-protein interactions can be analysed on protein microarrays. Protein interactions with other possible interaction entities, such as small molecules, liposomes and carbohydrates, can also be investigated. The complexity of the analytes can vary depending on whether they are crude biological samples or pure reagents, and the multiplexing can lie in detecting either many antigens in a few samples or few antigens in many samples. Protein microarrays can consequently be used for a large number of applications and reagents. It also means that the world of protein microarrays is a multidimensional world, which looks different depending on the dimension that it is being projected on. Due to the variation in possible microarray assays, there are also variations in how different protein microarray assays are classified. However, the two main choices are to refer to them based either on what the goal of the assay is or on which phase carries the analyte. The assays can be either analytical or functional in nature. The goal of analytical assays is to measure the concentration or existence of an antigen in a sample, and the goal of functional assays is to measure some biological activity. Therefore, one way of classifying protein microarrays is to refer to them as either analytical protein microarrays or functional protein microarrays65-68. The other way of classifying them is by referring to which phase, the solid substrate phase or the liquid phase, the analyte is presented in. If the analyte is in the liquid phase of the assay it is regarded
as a forward phase protein microarray, and conversely as a reverse phase protein microarray if the analyte is immobilized on the substrate's solid phase69, 70. I will here refer to protein microarrays as being either antigen arrays, where proteins are arrayed either to be detected by affinity reagents targeted towards them or to be investigated for functional properties, or capture arrays, where affinity reagents targeting analytes are arrayed. Alternatively, I will refer to them as reverse phase arrays, which are then a more complex case of the antigen array. This choice of nomenclature is due to the fact that capture arrays are dependent on maintaining the structural integrity of the capture molecules in order to preserve their function, while conformational integrity is not always necessary in antigen arrays. Finally, the reverse phase arrays are unique in the complexity of the arrayed analytes and as such represent a unique set of challenges.
Antigen Arrays
Antigen arrays can be used for a multitude of applications due to the many assay variations that are possible. One early example is the work of Michael Snyder et al., who generated an array containing purified, recombinantly produced gene products from more than 90 % of the S. cerevisiae genome, and investigated protein-protein interactions for calmodulin71. By doing this they became the first group to show the power of antigen arrays in screening for protein-protein interactions. Antigen arrays have since then been utilized for studying posttranslational modifications and binding interactions in a multitude of applications72. Antigen arrays can also be used for characterisation of antibodies by assessing their specificity, accuracy, precision, and detection limit, as performed by Brown et al.73 Similarly, this can also be performed in a high-throughput manner to validate the on-target interaction and profile off-target interactions of affinity reagents74. Antigen arrays have also been applied with great success for identification of targets for autoantibodies produced in autoimmune diseases and immunodeficiency75. In this case the antigens, either known antigens for quantification of autoantibodies or possible novel targets, are arrayed on a solid substrate, and a biofluid is applied and analysed for its content of immunoglobulins76, 77. Similarly, the immune response towards pathogens can be measured by arraying antigens and detecting the presence of antigen-specific immunoglobulins78.
Production of peptide arrays using photolithography allows for ultra-high-density arrays, which can cover the whole proteome even when overlapping peptides are used. One example of this is the epitope mapping performed on ultra-dense peptide arrays comprising 2.1 million 12-mer peptides with a 6-mer overlap, which cover all of the human proteins. These arrays have been successfully used for mapping the contribution of each amino acid to the binding of both monoclonal and polyclonal antibodies, and to investigate specificity and cross-reactivity across the whole epitome79.
While antigen arrays are often synonymous with protein or peptide arrays, exceptions exist, and one that should be mentioned is the carbohydrate array. The study of carbohydrates in cellular biology is a field of high interest to many research groups, as carbohydrates can be found both inside and outside of cells and take part in various biomolecular processes. However, due to the very large variety in branching and molecular modifications of carbohydrates in biological systems, the cellular glycome is believed to comprise up to 500 000 different structures. This large number of possible variations has made microarrays an important technology for rapid analysis of carbohydrate-protein interactions80. Carbohydrates for arrays are produced either through chemical synthesis or from natural sources. If the carbohydrates are large enough, non-covalent adsorption to a substrate through electrostatic or hydrophobic interactions is possible. However, if the carbohydrates are small, the resulting forces might be too weak to retain the carbohydrates on the substrate during washing steps, and covalent attachment becomes necessary81, 82. Carbohydrate arrays have been extensively used to investigate carbohydrate biology and its involvement in diseases, infection and cell-signalling80.
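The overlapping-peptide design mentioned earlier in this section (12-mer peptides that share six residues with their neighbours) can be sketched in Python as follows. The function name `tile_peptides` and the toy sequence are hypothetical and only illustrate the tiling principle; the sketch makes no claim about how the cited arrays were actually designed.

```python
def tile_peptides(sequence: str, length: int = 12, overlap: int = 6) -> list[str]:
    """Split a protein sequence into fixed-length peptides in which
    consecutive peptides share `overlap` residues."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    # Trailing residues that do not fill a complete peptide are not covered
    # in this simplified sketch.
    return [sequence[i:i + length]
            for i in range(0, len(sequence) - length + 1, step)]


if __name__ == "__main__":
    toy_protein = "ACDEFGHIKLMNPQRSTVWYACDE"  # 24 residues, illustrative only
    for peptide in tile_peptides(toy_protein):
        print(peptide)
```

With a step of six residues, every position away from the sequence ends is covered by two peptides, so a binding signal can be localised to the stretch shared by neighbouring peptides.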
Capture Arrays The variation in microarray capture assays, and in the choice of technical approaches to labelling, amplification, substrates and detection, is large83. The main difference between antigen arrays and capture arrays is that for capture arrays the arrayed capture molecules are generally similar, with active sites that need to be accessible. In practice this means that immobilising capture molecules, such as antibodies or Fab fragments, in an oriented manner with the binding sites exposed to the liquid phase can result in up to ten-fold higher binding capacity for the analyte in the samples, as compared to random immobilization84. The commonly used
method to achieve oriented immobilisation of antibodies is the use of an intermediate affinity protein, such as protein A or protein G, or chemical modification of the Fc-region84. Random immobilisation of antibodies has been shown to cause conformational changes and loss of function in the majority of the proteins85. This change in conformation could be assumed to affect other types of proteins as well, and therefore more emphasis has to be placed on orientation of the arrayed proteins if their functionality is important for the assay. Further improvement in the retention of functionality can be seen with hydrogel-based substrates, which often result in higher signal-to-noise ratios than flat 2D-surfaces86, 87, due to higher binding capacity and improved hydration. An even further improvement of the format is the bead-based assay42, which is performed in a liquid phase, thus maintaining the hydration of the molecules and circumventing the loss of function due to denaturation. While the common sandwich-based assay offers high sensitivity and specificity due to the dual binding needed for readout, the sandwich format is generally less compatible with multiplexed array assays, due to the previously mentioned problem of off-target interactions from detection reagents. The alternatives to using a sandwich assay are to label the whole sample and compare samples through included controls or a common reference, or to do a direct comparison between samples. The reference design can be set up through a pool of equal aliquots of all samples to be tested, which is labelled using a different tag than the samples. This pool can then be used as an internal normalization standard, similar to the reference mix commonly used for DNA microarray experiments11, 73, 86. This might seem an attractive approach for comparing different samples against each other, as the relative abundance of targets should be comparable through the reference pool. 
However, the sensitivity of such an assay can be seriously hampered by competition for capture molecule binding sites that can occur between sample and reference. This has led to a widespread preference for the single-colour approach, where each sample is labelled with the same label and compared directly or through controls88. Although somewhat complex to set up, capture arrays have become an indispensable tool for biomarker discovery and have been applied in numerous disease biomarker discovery platforms89.
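The reference design described above can be illustrated with a small calculation. The following is a minimal sketch, assuming that each feature yields one background-corrected intensity in the sample channel and one in the reference-pool channel. The function name and the example intensities are hypothetical and are not taken from any of the cited studies.

```python
import math

def log2_ratios(sample, reference):
    """Return log2(sample / reference) for each feature.

    Both arguments are sequences of background-corrected intensities,
    one value per feature, given in the same feature order.
    """
    if len(sample) != len(reference):
        raise ValueError("channels must cover the same features")
    ratios = []
    for s, r in zip(sample, reference):
        if s <= 0 or r <= 0:
            # A log ratio is undefined for non-positive intensities.
            ratios.append(float("nan"))
        else:
            ratios.append(math.log2(s / r))
    return ratios

# Hypothetical intensities for three features on one array.
sample_channel = [1200.0, 300.0, 800.0]
reference_channel = [600.0, 600.0, 800.0]
print(log2_ratios(sample_channel, reference_channel))
```

Because every array carries the same reference pool, the per-feature ratios from different arrays can be compared with each other. The sketch does not model the competition between sample and reference for binding sites.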
Reverse Phase Arrays In a reverse phase assay the analytes are immobilized on the substrate and capture reagents are applied in the solution phase. This approach allows for simultaneous analysis of up to tens of thousands of analytes or samples but is sensitive to off-target interactions from detection reagents. The reverse phase format has been extensively utilized for analysis of complex samples such as tissue lysates90 and cell lines43, as well as for body fluids such as serum91 and cerebrospinal fluid92. The goal of the analysis has often been the study of biomolecular network pathways, but also large-scale screening. Due to the complex nature of the analytes or samples, the standard for reverse phase arrays is to array each sample in triplicate with at least two different dilutions93. This increases the likelihood that at least one dilution will fall within the linear range of the detection method, even though the samples might have variable concentrations of the target molecule. However, this also limits the actual number of samples that can be processed in one array, pushing it from thousands to hundreds, as replicates and controls occupy space on the array. Arraying and detecting targets in complex mixtures presents more challenges than in samples that contain only one or a few different proteins. The starting concentration of the samples might be high, and low-abundant target proteins might become masked by protein species of higher abundance. The small volumes that are arrayed can result in few target molecules being present in the feature if the starting concentration of the target in the sample is low. For the lower abundant proteins the statistical chance of having the protein present in the spotted volume consequently approaches zero even if no sample dilution is performed during preparation. Low-abundant proteins might as a result be represented by only a few molecules in each feature, together with the complex mix of proteins of higher abundance. 
This makes it difficult to detect them among the unspecific background from off-target interactions of the detection molecules, or from autofluorescence of the substrate. Another challenge is that most substrates do not have the loading capacity to effectively retain all proteins in a complex sample such as serum or plasma. The excess material can result in bleed-off of unbound material from the feature, which can cause an elevated background in the vicinity of the feature94. Consequently, reverse phase assays often use high-capacity substrates such as nitrocellulose95. The off-target interactions of detection molecules have to
be addressed by choosing detection molecules that have gone through extensive validation of their specificity and target recognition, and autofluorescence can be minimised by choosing a detection wavelength that does not overlap with the autofluorescence of the substrate. Near-infrared scanners have therefore become prominent in reverse phase array assays, owing to the fact that their excitation and detection wavelengths lie beyond those of the autofluorescence of the popular nitrocellulose substrates96.
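The argument about low-abundant proteins in small spotted volumes, and the cost of replicates and dilutions in terms of array space, can be made concrete with a short calculation. The sketch below assumes that the number of target molecules in a droplet follows a Poisson distribution. The function names, the 1 nL droplet, the 1 fM concentration and the 12,000-feature array are all hypothetical values chosen for illustration, and controls are not counted.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mole

def expected_molecules(concentration_molar, volume_litres):
    """Mean number of target molecules in one deposited droplet."""
    return concentration_molar * volume_litres * AVOGADRO

def probability_no_target(mean_molecules):
    """Poisson probability that a droplet contains no target molecule."""
    return math.exp(-mean_molecules)

def samples_per_array(features, replicates=3, dilutions=2):
    """Samples that fit when each needs replicates x dilutions features."""
    return features // (replicates * dilutions)

# Hypothetical example: a 1 nL droplet of a target present at 1 fM.
mean = expected_molecules(1e-15, 1e-9)
print(mean, probability_no_target(mean))

# Hypothetical array of 12,000 features, triplicates at two dilutions.
print(samples_per_array(12000))
```

Under these assumptions a droplet holds about 0.6 target molecules on average, so a little over half of the features would contain no target molecule at all, while the replicate and dilution scheme divides the number of samples per array by six.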
Technical Considerations for Protein Microarrays Microarrays present a unique set of challenges due to the very minute volumes that are being deposited and the small spatial scale of the features. Variables such as humidity and temperature become important aspects when arraying such small amounts of liquid, and must be taken into consideration to ensure that features are properly arrayed. The concentration of arrayed proteins, buffer composition, and drying of the array greatly affect the morphology and linear range of individual features. The better the alignment and morphology of features can be controlled, the tighter they can be packed. A tighter packing increases the achievable array density, which in turn increases the number of tests that can be carried out simultaneously. Visualisation and quantification of the amount of protein immobilised in each feature is often of great interest, both as a quality control and for normalisation purposes. One way to achieve this is to use a detection reagent that detects protein in general on one array, and to extrapolate the obtained results to other arrays generated during the same production run. Another method is to produce the proteins as fusions to a tag that is detectable using a tag-specific reagent. The tag can then be used to visualise or quantify the total target content of each individual feature simultaneously with the analyte-specific test through a dual-colour approach. However, since a common tag might cause cross-reactivity, consideration must be given to possible unwanted interactions with the tag that might result in increased background. Instruments utilising different technologies for production of microarrays have been available on the market or presented in research articles. However, the principles used by the most common of these instruments can be roughly divided into the two categories of contact printing and non-contact printing. 
Contact printing involves making physical contact between the substrate and pins so that material is transferred from the pin to the substrate. Non-contact printing includes a diverse group of systems, some of which utilise inkjet technology to eject droplets through the air onto the substrate while others use photochemistry to build up peptides on substrate surfaces. The liquid volumes that are deposited are often in the nanoliter range and should produce features that are morphologically homogeneous and at the same time spatially discrete and densely packed. As mentioned previously, this presents a considerable challenge, owing to the small volumes and the great impact of even minute changes in the ambient environment or sample composition.
Contact Printing The most common way of producing arrays on a flat substrate, such as a microscope slide, is by contact printing with pin-printing instruments. The reason for their widespread use is their initial adoption for DNA microarray production, and that the experience gained from DNA microarrays was later applied to protein printing97. Feature uniformity in pin-printing is affected by the viscosity of the solution, substrate surface properties such as planarity, and pin properties such as pin velocity and pin design. Viscosity is in turn affected by temperature, where higher temperature results in lower viscosity, which affects the volume deposited on the substrate. The change in volume ultimately impacts the size and morphology of the arrayed features. Another factor that impacts feature size is the hydrophobicity of the substrate and solution, as deposited drops tend to increase or decrease their contact with the substrate depending on the hydrophobicity of the phases. Changing to a more hydrophobic buffer for the solution can consequently result in a larger feature area on a hydrophobic substrate, and a more hydrophilic substrate can conversely lead to an increase in feature size97. The most common pins used in pin-printing are solid pins, split pins, and quill pins. The solid pins are loaded with solution by dipping the pin in the solution and allowing the capillary force to retain a small volume at the tip of the pin. Solid pins have a low loading capacity and are only useful for small-scale arraying, as a single loading can only array a few spots before being depleted. The strength of the solid pins is that they are not very sensitive to the viscosity of the solution, and consequently can be used for high-viscosity solutions such as protein samples. They are also not especially sensitive to dust and debris and can be cleaned fairly
easily98. A modification of the solid pin is the “pin and ring” system, which uses a large capillary that is loaded with sample and a solid pin that passes through the meniscus of the capillary before making contact with the substrate, thus increasing the capacity and requiring less reloading of sample97. Split pins are a further development of the solid pins and have a microchannel in the tip into which the solution is loaded through capillary forces. Owing to the larger volume, they can produce more features before reloading of the solution becomes necessary. A variation of the split pin is the quill pin, which, in addition to the microchannel, contains a small reservoir for a further increase in loading capacity. Although split and quill pins allow for higher throughput, they are also sensitive to dust and clogging, as well as dependent on low-viscosity solutions for successful deposition on the substrate. They are also dependent on high humidity being maintained during printing to prevent drying of the samples in the pin97, 99. When using split pins the feature size may also vary during arraying due to the changing volume in the tip. The initial features get a large volume deposited due to draining of the excess solution on the outside of the pins, and become large or merge. The last features produced become small or might not be arrayed at all due to completely depleted pins100, 101. To prevent this, pre-printing on slides to drain the excess volume before continuing arraying is generally necessary with split and quill pins. Due to the variation in feature size, care should be taken to keep track of the order in which the slides have been printed, to ensure that arrays that are being used for a single project are as similar as possible. These pins are usually made of metals such as stainless steel, tungsten, or titanium, and the solid pins are relatively insensitive to damage. 
However, split pins and quill pins can easily be deformed if the pins touch the substrate with too much force97. A further limitation of pins is that proteins tend to adsorb to metal surfaces102, making metal pins difficult to clean. Taken together, these limitations of the pin-based technology make contact printing less desirable for producing protein microarrays. However, pin-based printers are still in extensive use for protein microarray production due to their legacy from the DNA microarray era.
Non-contact Printing Non-contact printing involves no physical contact between the arraying tool and the substrate surface. Instead, the solution is contained in a reservoir and ejected as droplets through a nozzle at some distance above the surface. This can be achieved in different ways103, but the most common is piezo actuation. In arraying instruments that utilise piezo actuation, a deformation of a small reservoir holding the solution causes a pressure change in the reservoir, which in turn results in the ejection of a controlled amount of solution from a nozzle104. One advantage of non-contact printing is that it allows for samples of higher viscosity due to the use of nozzles. This makes it easier to array protein samples, which can be of high viscosity and difficult to array using pin-based instruments. Another advantage is the lack of physical contact with the surface of the substrate, as this removes the need to make the printhead travel in the z-axis. Consequently drops can be dispensed “on the fly”, greatly speeding up the deposition of droplets. In addition, this eliminates the risk of damaging the substrate through physical impact, which can otherwise happen to sensitive surfaces such as hydrogels and nitrocellulose. The comparably large reservoirs used in non-contact printing also mean that more droplets can be deposited before reloading. This improves feature-to-feature reproducibility, and ensures that the first and last feature in a print run will have the same size. Although non-contact print nozzles are commonly manufactured from materials other than metals, adsorption of proteins can still occur. Additionally, deposition of salt from buffers and samples can affect the nozzle's ability to dispense droplets. This might result in features that are missing or out of alignment, carry-over of sample between loadings, and small satellite spots105. 
A third way of producing protein arrays is by synthesizing the proteins or peptides directly on the substrate, for example by photolithographic synthesis. The principle of the technique is to combine UV light with photomasks or digital micromirrors that can be controlled to selectively illuminate small areas of the array and deprotect amino acids carrying photo-sensitive protective groups. These deprotected amino acids then become available for reaction with new amino acids carrying photo-sensitive groups, and thus addition of amino acids to specific parts of the array can be achieved in an iterative manner106. This can generate ultra-dense arrays of peptides
with over 2 million unique peptides on a 2 cm² area. Such an array would be enough to represent all human genome-derived proteins as 18-mer polypeptides with as much as a 12-mer overlap79, 107.
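The way a protein is represented by overlapping peptides can be sketched in a few lines. This is an illustration only, not the design procedure used for the cited arrays. The function name and the example sequence are hypothetical, and the peptide length and overlap are passed explicitly so that either of the layouts mentioned in the text can be tried.

```python
def tile_peptides(sequence, length, overlap):
    """Split a protein sequence into overlapping peptides.

    Consecutive peptides share `overlap` residues, so the window
    advances by `length - overlap` residues at each step. Trailing
    residues that do not fill a complete window are not covered in
    this simple sketch, and a sequence shorter than `length` is
    returned as a single shorter peptide.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    last_start = max(len(sequence) - length, 0)
    peptides = []
    for start in range(0, last_start + 1, step):
        peptides.append(sequence[start:start + length])
    return peptides

# Hypothetical 24-residue sequence tiled as 12-mers with a 6-residue overlap.
example = "ACDEFGHIKLMNPQRSTVWYACDE"
for peptide in tile_peptides(example, length=12, overlap=6):
    print(peptide)
```

A larger overlap gives finer positional resolution for epitope mapping, at the cost of more peptides per protein.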
In Situ Synthesis Recombinant expression and purification of proteins can be expensive, time-consuming, and fraught with hurdles such as poor expression, protein degradation and aggregation67, 108. An alternative is to express proteins directly on the substrate instead of performing conventional expression and purification109. This is usually referred to as in situ synthesis, or cell-free protein array production. The two main methods for this are the “protein in situ array”108, 110 and the “nucleic acid programmable protein array”110, 111, or in short PISA and NAPPA. By generating the proteins of interest directly on the substrate, the problems with solubility and functionality can to a large extent be alleviated. For the PISA technology this is achieved by using the DNA sequence for the protein of interest, together with an N- or C-terminal hexahistidine tag that allows for binding to an immobilisation agent that is precoated on the substrate. These are then arrayed together with a cell-free expression system that contains all the essential elements for transcription and translation. The result is a coupled transcription and translation of the DNA, which is followed by immediate capture of the protein on the surface of the substrate. After removal of the expression system through washing, only the captured, freshly synthesized proteins should be bound to the substrate surface. In NAPPA the DNA encodes the proteins as GST-tag fusions and is arrayed together with a crosslinker, which immobilises the DNA on the substrate surface, and an anti-GST capture antibody. This produces an array with colocalised DNA and capture antibody, which can be stored and expressed on demand by covering the surface with a cell-free expression system. After removal of the expression system an array of proteins colocalised with their DNA and capture antibody remains112.
Substrates Microarrays result in a very visual representation of the acquired data, through false-colour images of the slides and their features. These images can be difficult to interpret if the processes involved in the immobilization of proteins and formation of features are not fully understood by the analyst. One of the most typical problems occurs due to nonuniformity of features in the array, where the nonuniformity usually takes on the shape of rings. That thin liquid films take on ring-like appearances is well known, and the everyday example would be the coffee ring effect that occurs when particle-laden liquid is spilled and allowed to evaporate. The general background to this phenomenon is known to be the transport of material from the inside of the droplet to the interface between gas and liquid, and the further transport through capillary flow from the centre of the droplet to the pinned edges of the droplet. Protein molecules tend to accumulate at the gas/liquid interface because their hydrophobic and hydrophilic parts have different preferences for being in either the gas phase or the liquid phase113. This results in accumulation of protein at the interface, which depletes the concentration within the droplet. The depletion then lowers the concentration of proteins available for binding at the substrate surface. Further diffusion of proteins along the interface to the boundary of the pinned edges of the droplet depletes the centre of the gas/liquid interface, lowering the concentration at the interface and resulting in further transport from the centre of the droplet114. In this way the concentration available for binding to the substrate is continuously lowered, decreasing the immobilisation efficiency115. This effect is evident regardless of whether the droplet is allowed to dry or not, but delaying the drying will give the proteins more time to immobilise. 
Consequently the use of liquids with low vapour pressure, such as glycerol, can improve the uniformity of the features116. Another solution is to outcompete the accumulation of proteins at the gas/liquid interface by adding a surfactant117. This will displace the proteins from the interface and increase the concentration at the substrate surface, which improves immobilisation efficiency. Microarray technology for proteins was originally adapted from DNA microarrays, but proteins come with a completely different set of challenges owing to a greater variation in their chemical properties. Due to the large variation in size, structure, charge etc., proteins will
vary in performance on different substrates. This affects the usefulness of a substrate for different assays, and care must be taken when choosing the substrate. If the proteins are immobilised in such a way that they change their properties through conformational changes or that their active sites become inaccessible, then those proteins might not perform well if the assay depends on their conformation. However, if the assay is independent of the structure and orientation of the protein, the immobilisation may have less importance for the final result. Many proteins are folded in their native state so that they display a hydrophilic exterior while the hydrophobic parts are folded toward the centre of the proteins. If the proteins are then immobilised on a hydrophobic substrate the result might be denaturation or even refolding of the proteins. This could even cause the hydrophobic parts of a protein to be displayed on the outside of the protein, while the hydrophilic parts fold inwards, inactivating the protein118. One example of an assay that is greatly improved by retaining the conformation and correctly orienting the proteins is the antibody array, as previously mentioned. For antibody arrays the orientation of the antibody molecules can be arranged by using protein A, protein G or the recombinant protein A/G. These proteins bind to the Fc-region of antibodies and orient the Fab-regions away from the substrate surface, and owing to the increased accessibility of the Fab-regions the amount of functional antibodies will be higher119. Substrates can be divided into many different groups depending on their properties, but the two most easily understood divisions are by geometry and by coupling chemistry. The immobilisation chemistries can in general be divided into physical adsorption, affinity binding and covalent binding. 
When adsorption is used the proteins attach to the substrate through nonspecific interactions such as hydrophobic interactions, van der Waals forces or electrostatic interactions. This immobilisation results in random orientation of the immobilised proteins and can result in denaturing or other structural changes, ultimately leading to loss of protein function. This means that immobilisation strategies that rely on adsorption might be less optimal for assays that require functional proteins, such as the previously mentioned capture assays or functional assays. Another disadvantage of physical adsorption is that adsorbed proteins may desorb during assays, leading to loss of signal.
An example of oriented affinity immobilisation has already been given above in conjunction with the discussion on orientation of antibodies, and some other examples are also worth mentioning. For instance, site-specific biotinylation of proteins can be used to orient them on a streptavidin-coated surface, and histidine-tagged recombinant proteins can be oriented by arraying them on a substrate coated with metal complexes that bind to the histidine tag. The third major strategy for the immobilisation of proteins in microarrays is covalent binding to reactive groups, such as aldehyde, epoxy, amino, or NHS ester (N-hydroxysuccinimide) groups. These reactive groups bind through reaction with primary amine, thiol and hydroxyl groups. The covalent attachment prevents the proteins from being removed during washing but might cause them to denature, and consequently care should be taken if the arrays will be used for functional assays120. As for geometries, the substrates can either have a two-dimensional coating of a monolayer of reactive groups on a flat support, or a three-dimensional structure such as membranes or hydrogels. A substrate with a monolayer of reactive groups is comparatively easy to produce, and due to the low intrinsic autofluorescence of the substrate a low background can be achieved. However, due to the limited geometry available for binding, the amount of protein that can be immobilised is limited. If the amount of protein available in a droplet exceeds the binding capacity of the surface, the rest of the proteins will only be loosely associated with the immobilised layer. This can cause an increase in local background and/or abnormalities such as tailing of spots when the excess material is spread out during the blocking and washing stages. If a higher amount of protein is to be arrayed, a substrate that can be loaded with more protein is of better use. 
This could for instance be a hydrogel-coated substrate, which consists of polymers that form a coating a few nm thick when dry but can swell to up to a few hundred nm when hydrated121, 122. The hydrogel can then provide an environment that keeps the proteins hydrated and stabilised, while the attachment of proteins can be tailored using reactive groups123. Another substrate coating that can accommodate even larger amounts of protein is nitrocellulose. Nitrocellulose-covered substrates provide a high loading capacity in a microporous structure that can be around 10 µm thick. This nitrocellulose layer can accommodate a large amount of protein and as such is suitable if the amount of material to be immobilised in each feature is high. However, this increase in loading capacity also comes
with a higher degree of autofluorescence, especially in the wavelengths commonly used for fluorescent labels, and may require specialised detection instruments.
Detection Principles In order to detect target proteins on a microarray an affinity molecule directed against the target protein, a so-called “primary affinity reagent”, is applied to the array. This affinity molecule will then find and bind to its target. These affinity molecules can be labelled in such a way that they can be detected directly after binding. However, usually a labelled secondary affinity reagent that is directed against the primary affinity reagent is applied after the binding of the primary reagent, and this is then followed by detection of the attached label. This way of detecting the interaction between an affinity reagent and its antigen is called label-based detection. The reasons for the use of two affinity reagents are that it makes it possible to detect different primary affinity reagents without the need to do multiple labelling experiments, and that off-the-shelf secondary detection reagents are commercially available. Another advantage is that using a secondary affinity reagent means that the label will not interfere with the binding site of the primary affinity reagent, if the binding site of the secondary reagent is known. It is also possible to do label-free detection of the affinity reagents, where some innate property of the reagent itself is measured. This makes it possible to avoid many of the difficulties that are part of labelling molecules and detecting different kinds of labels. However, label-free assays come with their own sets of limitations and problems124.
Label-based Assays Label-based arrays can be based on radioactive molecules, fluorescent molecules, nanoparticles, quantum dots or biological barcodes. However, the most commonly used method is to use a fluorescently labelled secondary affinity reagent to detect the primary affinity reagent. One example is to use a primary antibody raised in one animal and a labelled secondary antibody, raised in another animal and directed against the Fc-fragment of antibodies from the same animal as the primary antibody. This then ensures that no interference with the variable Fab-region of the primary antibodies occurs.
Dual colours enable detection of two targets at the same time. As previously mentioned, a tag attached to a recombinant protein can be detected in one colour channel, and serve as a confirmation of the presence of the protein on the array, while the detection of specific proteins is performed using the other colour channel. Using more complex detection methods, more colours can be detected on the same array, and consequently a higher degree of multiplexing can be achieved. Unfortunately the different fluorescent molecules tend to have emission spectra that overlap to some extent. This means that the higher the level of multiplexing, the more difficult it might become to find molecules that have good separation between their respective emission spectra. Consequently, an increase in background can be expected as different molecules add to the measured signal due to spectral overlap125, in addition to the previously mentioned increase in off-target interactions. The most commonly used fluorescent molecules are Cy5 and Cy3. These cyanine fluorophores have been known to be sensitive to photobleaching and quenching, leading to lowered fluorescent intensity and a bias in the representation of the dyes126, 127. An alternative is the Alexa dyes, which show lower dye bias due to their higher resistance to both photobleaching and quenching, which should result in higher correlation between the two channels. However, conflicting results have been found when comparing fluorescent molecules127-129, indicating that the differences between these popular dyes might be minor. Other examples of fluorescent detection molecules are rhodamine, Texas Red, fluorescein, and quantum dots130, 131. One limitation of label-based detection is the number of fluorescent molecules that can be attached to each detection molecule without adversely affecting its recognition capabilities. 
In order to increase this number, and thus achieve a higher signal and detect proteins of lower abundance, an amplification method such as rolling circle amplification can be applied. Rolling circle amplification is an enzymatic amplification approach using an oligonucleotide primer, a circular single-stranded DNA, DNA polymerase and nucleotides. The primer is attached to an affinity molecule that binds to the desired antigen. The circular DNA hybridizes to the primer, and the DNA polymerase starts replicating the DNA using the free nucleotides. This creates a long strand of single-stranded DNA complementary to the circular DNA. Primers coupled to a fluorophore, or other detection molecules, can then bind to the long single-stranded DNA132.
This not only means that it is possible to attach multiple detection molecules to one affinity molecule to amplify the signal, but also that a higher degree of multiplexing can be achieved133.
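The effect of overlapping emission spectra on multiplexed fluorescent detection, discussed earlier in this section, can be illustrated with a simple forward model. The sketch below assumes a linear model in which a fixed fraction of each dye's signal is also registered in the other channel. The function name, the bleed-through fractions and the intensities are hypothetical and do not describe any particular dye pair or scanner.

```python
def measured_signals(true_a, true_b, bleed_b_into_a, bleed_a_into_b):
    """Apparent two-channel intensities under linear spectral crosstalk.

    true_a and true_b are the signals each dye would give on its own.
    bleed_b_into_a is the fraction of dye B signal that is also
    registered in channel A, and bleed_a_into_b is the reverse.
    """
    measured_a = true_a + bleed_b_into_a * true_b
    measured_b = true_b + bleed_a_into_b * true_a
    return measured_a, measured_b

# Hypothetical feature: weak signal in channel A, strong signal in channel B.
channel_a, channel_b = measured_signals(100.0, 10000.0, 0.02, 0.01)
print(channel_a, channel_b)
```

In this example even a small bleed-through fraction from the strong channel adds more signal to the weak channel than the true signal itself, which is why spectral overlap can limit the detection of low-abundant targets.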
Label-free Assays In label-free assays an inherent property of the target itself, such as the mass of the molecule, is measured. This is a desirable property of an assay as it ensures that the measured signal is a direct result of the target molecule. In comparison to using labelled detection molecules, the label-free assay would not be subject to off-target interactions towards other detection molecules, and would not be dependent on laborious labelling of the molecules. The two most common label-free methods for measuring proteins are surface plasmon resonance (SPR) and mass spectrometry134, 135. The basic function of a mass spectrometer is to measure proteins or peptides by ionising and accelerating them to a common velocity and measuring their deflection in a magnetic field. The resulting deflection depends on their mass and their charge, and the output can be compared to databases with known or theoretical output. This potentially allows for identification of proteins and their proteoforms but often requires preparation of samples through enzymatic digestion or protein separation to reach high sensitivity136. In SPR the change in angle of reflection that occurs as mass changes when proteins interact with each other on metal surfaces is used to determine when a binding event happens. By measuring this change over time, full kinetic information on the binding event can be extracted. Although SPR has been used for detection of interactions between proteins for quite some time, no SPR platform has been developed that can be used in a true high-throughput microarray format. Some microarray applications of SPR have been produced, but the number of possible simultaneous analytes has usually remained limited to a few hundred135, although arrays of thousands of features are possible137.
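The kinetic information mentioned above is commonly obtained by fitting the measured response curve to a binding model. As a sketch, assuming a simple 1:1 interaction (an assumption made here for illustration, since no model is specified in the text), the response R(t) for an analyte at concentration C can be written with an association rate constant k_a, a dissociation rate constant k_d and a saturation response R_max, all of which are symbols introduced here:

```latex
\begin{align}
\frac{dR}{dt} &= k_a \, C \, \bigl(R_{\max} - R\bigr) - k_d \, R \\
K_D &= \frac{k_d}{k_a}
\end{align}
```

The first term describes binding to the sites that are still free and the second term describes dissociation of already formed complexes. The equilibrium dissociation constant K_D follows from the two fitted rate constants.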
Technical Considerations for Protein Microarrays
Blocking and Background
One of the factors that has to be addressed when generating a protocol for any type of protein microarray is the issue of “background”. Background consists of detected off-target signals caused by the sample, the detection reagents, the substrate, or the contents of the buffer used for dilution of either sample or targets. Since background can originate from a number of different sources, its origin and causing agent must first be defined before it can be addressed.
Local background
The local background is commonly considered to be the signal in the reporter channel that originates from the area immediately surrounding the feature, and there are multiple contributing factors to this local background. One possible contributor is satellite spots, which are small spots surrounding the feature that originate from spraying due to incorrect viscosity when using non-contact printers. Another contributor is smearing or tails that originate from excess material being smeared from moist spots when liquid is applied to the array. A third contributor is autofluorescence from the substrate itself, and the fourth is interactions between proteins in the liquid phase and the substrate surface surrounding the features. These different factors have to be addressed separately: satellite spots by using the correct viscosity in the print buffer, smearing and tailing by ensuring that the features are properly dried before any liquid is applied, and autofluorescence by using substrates and detection methods that are mutually compatible. However, possibly the most difficult local background to address is off-target interactions between the substrate and the sample or detection reagents. On some substrates the local background can be addressed through chemical inactivation, by adding organic chemicals that deactivate the reactive groups. Alternatively, a well-defined protein, or mix of proteins, can be added and allowed to bind to the substrate to form a uniform layer94, 138. The goal of inactivating or blocking the substrate in this way is to reduce off-target interactions with the substrate to weak interactions and to achieve a uniform spread. Some care should be taken when deciding which blocking strategy to use, especially if the inactivation is performed with a protein.
It should preferably be a substance that can be assumed to have weak interactions with the contents of the type of sample that will be applied. As an example, bovine serum albumin is often a suitable blocking protein if the sample that will be applied in the liquid phase contains antibodies that are to be detected. Since serum albumin is the most prominent protein in the serum of mammals, most antibodies could be expected to show low general reactivity towards serum albumin. Addition of bovine serum albumin results in a monolayer of protein forming on the substrate surface, which then quenches unreacted groups and limits the interactions of other proteins in the liquid phase139. Adding more than one protein, such as a mixture of bovine serum albumin and fat-free powdered milk or gelatine, can further reduce the background, but might cause an increase in off-target binding to the substrate due to the increased complexity of the blocking agent. The local background often does not affect the detection of on-target binding to any great extent, and its adverse effects are then limited to making it difficult for image analysis software to detect the boundaries of features. However, if the local background is high or non-uniform it can affect the foreground data and must then be subtracted from the data. Most two-dimensional or hydrogel substrates will result in an off-target adsorption that is too low or too uniform to adversely affect the resulting data, as long as the foreground signal is of acceptable quality. If that is the case, applying downstream background subtraction to the dataset can add, rather than remove, noise, and care should be taken as to how adjustment for local background is performed. Thicker substrates such as nitrocellulose can often result in considerably higher local background due to higher autofluorescence from the substrate140. Applying blocking procedures can lower the local background for these thicker substrates as well.
However, it is far more efficient to choose detection wavelengths that lie beyond the autofluorescence of the material, such as near-infrared detection for nitrocellulose.
Global Background
So far only the background caused by the substrate has been taken into consideration, but a more difficult problem is often the background from off-target interactions between the proteins in the liquid phase and the features. It originates from off-target interactions between proteins in the sample or the detection reagents and proteins in the features, and is commonly related to the concentration and complexity of the liquid phase. As an example, Brown et al. found that this background fluorescence is dependent on the total amount and the complexity of the protein mixture in the liquid phase. Consequently, they found that high overall concentration tends to reduce the precision of the measurements73. To address this they suggested fractionation of samples to reduce the liquid phase complexity. However, as mentioned previously, depletion, separation or fractionation of samples is less desirable because it decreases reproducibility11 and is not suitable for high-throughput analysis. Although it can be true that increased complexity increases both local and global background, addition of proteins to a complex sample can also reduce background signals. This is commonly utilised by adding the same blocking agent to the sample as was used for blocking the substrate. This blocks the off-target adsorption of the proteins in the liquid phase before it is added to the array. Analogously, by adding more proteins, off-target interactions between proteins in the liquid phase and the features can be blocked. One such example is the global background that arises from some serum samples when applied to an antigen array where the features carry a six-histidine group. This group can be found in the Epstein-Barr virus, which is common in humans, and many individuals produce antibodies towards this peptide group. By adding the six-histidine group to the serum sample, most of the antibodies towards that group will interact with the six-histidine in solution rather than the six-histidine immobilised in the features. The same principle can be applied to detection reagents; if the detection reagent is a labelled goat antibody, then adding unlabelled goat antibodies to the sample can prevent weak background interactions between sample and reporter. The global background can also to some extent be alleviated by diluting the sample further if the foreground is saturated.
This will reduce the detected signals from both the on-target foreground and the off-target global background, and as the foreground enters the linear range of the detection method the global background will approach the lower detection limit. However, when the global background reaches the lower detection limit, any further dilution might only serve to increase noise in the global background, as the off-target background approaches the noise level that is introduced through other sources. This makes it difficult to assess whether the background is due to off-target interactions or introduced through other sources. This is illustrated in figure 5, where the relation between foreground and global background shifts from being dominated by off-target noise at high antibody concentrations to being dominated by noise from other sources at low antibody concentrations.
Figure 5: 3D plots of two antibody dilutions. The first plot shows the reported fluorescent signal intensity for six dilutions of two antibodies in separate subarrays. Their on-target interactions are denoted as red peaks, off-target interactions that are less than or equal to two standard deviations of the global background are denoted as green peaks, and off-target interactions above two standard deviations are denoted as yellow peaks. Each dilution was performed in a separate well, and the separation is represented by the white areas between the subarrays. The plot to the right shows the same data, expressed relative to the on-target peak for each subarray.
Multiplexing
When using microarrays, one of the choices that has to be made is whether the degree of multiplexing should emphasise features or analytes. A single feature physically occupies a small area on a slide, and hypothetically a much larger number of features could be arrayed than what is practically possible. This is due to technical challenges in arraying that can cause the distance between the features to become too small, with merging of features as a result. The distance needed between features depends on the precision of the instrument that is used for generating the arrays, the buffer composition of the features, the hydrophobicity of the substrate, the lattice arrangement, temperature, humidity etc. Since not all of these variables can be controlled, a distance that is large enough to compensate for them has to be chosen. As an example, features with a 100 µm diameter arrayed in a regular lattice might need a centre-to-centre distance of approximately 180 µm to prevent the features from merging. The smaller the distance between features, the greater the risk of features merging. By increasing the precision of arraying instruments, and by controlling the ambient conditions and sample viscosity, variation in placement can be reduced and the packing density of features can be increased. It might be assumed that the optimal choice is to maximise the usable area on each substrate and acquire the maximum number of data points per sample. This might be desirable if the goal is to perform an undirected approach for novel discoveries of protein interactions or biological markers for diseases, such as for autoimmunity or protein profiling. However, it comes at a great cost, as it requires expendable reagents whose relevance to the proposed experiment might be limited, as well as more substrate. The addition of more unique features does not necessarily result in a one-to-one increase in meaningful foreground signals, which results in diminishing returns on the cost incurred from acquiring or producing the arrays. The alternative to this undirected approach is a planned, directed approach. Employing a directed approach entails forming a hypothesis as to which targets might be reactive in the specific experimental context and acquiring those targets for arraying. In this way the spatial size of the arrays can be kept smaller, and higher throughput per substrate can be reached through the use of subarrays and compartmentalisation. The sparsity of data that can occur for very large arrays is shown in figure 6 and illustrates the diminishing returns that can result from an undirected approach. If the interactions shown in red in figure 6 had been thought to have relevance in the disease setting, and a directed approach comprising these proteins had been performed, the cost incurred for the analysis could have been minimised. This is unfortunately not always possible, and as such these large-scale arrays represent an important tool for discovering unexpected autoimmune interactions.
Figure 6: 3D representation of all 21,120 values in one array. The data originate from autoimmunity profiling of a human serum sample. Any data point below or equal to two standard deviations is coloured green, data points above two but below or equal to ten standard deviations are coloured yellow, and data points above ten standard deviations are coloured red.
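The colour coding described in the caption of figure 6 amounts to a simple threshold rule, which can be sketched as follows. This is a minimal illustration only: it assumes that each value is expressed as a number of standard deviations above some background centre, and since the way the centre and standard deviation were estimated is not stated here, `centre` and `sd` are hypothetical inputs, as is the function name.

```python
def classify_signal(value, centre, sd, low=2.0, high=10.0):
    """Return the colour class of one data point.

    centre and sd describe the background and are assumed to be supplied
    by the caller; low and high are the thresholds in standard deviations
    taken from the figure caption (two and ten).
    """
    distance = (value - centre) / sd  # distance from background in SD units
    if distance <= low:
        return "green"    # at or below two standard deviations
    if distance <= high:
        return "yellow"   # above two, at or below ten standard deviations
    return "red"          # above ten standard deviations


# Hypothetical example values with a background centre of 100 and an SD of 10.
for signal in (110, 150, 400):
    print(signal, classify_signal(signal, centre=100, sd=10))
```

With only two classes (a single threshold at two standard deviations) the same rule would correspond to the green and yellow peaks in figure 5.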
The added benefit of analysing more analytes per substrate is that variation and cost per experiment can be reduced simultaneously. This change in throughput can be illustrated by considering that a session with an arraying instrument might produce 100 microscope slides with one array comprising 57,000 features each, usable for a total of 100 analytes. Alternatively, it can produce 100 slides with 24 arrays of 1,500 features each, for a total of 2,400 analytes. The throughput can thus be shifted between features and analytes through compartmentalisation. Compartmentalisation can be achieved through the use of multiwell hybridisation cassettes or microscope slide holders, and commercially available holders can be found with up to 192 compartments per slide. While such a large number of compartments allows a large number of samples to be analysed simultaneously, the arrays will by necessity be very small. If larger arrays are needed, holders suitable for automation could be a more effective choice. A holder for four microarray slides with 24 compartments per slide results in a standard 96-well format for implementation of high-throughput applications using standard pipetting robots. In the end the choice often stands between reaching high throughput in the dimension of either analytes or features, and compensating for the lack of throughput in the other dimension through increased cost and time.
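The trade-off between features and analytes can be made concrete with a few lines of arithmetic. The sketch below uses the slide and array counts from the example above; the function names are illustrative, and combining these counts with the 180 µm centre-to-centre distance mentioned in the Multiplexing discussion is an assumption made purely for illustration, not a description of the actual array layouts.

```python
def analytes_per_session(slides, arrays_per_slide):
    """One analyte (sample) is applied per array, so analytes = number of arrays."""
    return slides * arrays_per_slide


def features_per_session(slides, arrays_per_slide, features_per_array):
    """Total number of printed features, i.e. potential data points, per session."""
    return slides * arrays_per_slide * features_per_array


def lattice_footprint_mm2(n_features, pitch_um):
    """Area of a square lattice with one feature per pitch x pitch cell (assumed layout)."""
    pitch_mm = pitch_um / 1000
    return n_features * pitch_mm ** 2


# Feature-oriented session: 100 slides, one 57,000-feature array per slide.
print(analytes_per_session(100, 1), features_per_session(100, 1, 57000))
# Analyte-oriented session: 100 slides, 24 arrays of 1,500 features per slide.
print(analytes_per_session(100, 24), features_per_session(100, 24, 1500))
# Four slides with 24 compartments each give the 96-well format.
print(analytes_per_session(4, 24))
# Approximate footprint of each array type at an assumed 180 µm pitch.
print(lattice_footprint_mm2(57000, 180), lattice_footprint_mm2(1500, 180))
```

Under the assumed pitch the large array would occupy roughly 1,850 mm², which is most of a microscope slide, while a 1,500-feature subarray would need about 49 mm². The analyte-oriented design thus gains a factor of 24 in analytes at the price of fewer features per analyte.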
Image Analysis
Detection of fluorescent labels is usually conducted in a microarray scanner that uses lasers to excite the fluorescent labels and a detector to detect the emitted light. The output of the instrument is a false-colour image from each channel, and these can be merged into one final image. A scan of a full microscope slide can comprise millions of data points, most of which are of little interest as they are derived from the empty slide around the features. The number of data points depends on the resolution of the scanner, which is often adjustable between 5, 10, 20, and 40 µm. At a resolution of 10 µm each data point will be represented by a 10x10 µm square, or “pixel”. As a result, at a resolution of 10 µm a circular feature with a diameter of 100 µm will comprise approximately 80 pixels, i.e. 80 measurements. If more measurements are desired and the resolution is increased to 5 µm, the number of measurements made in each feature will quadruple to over 300.
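The pixel counts quoted above follow from dividing the area of a circular feature by the area of one pixel. The sketch below reproduces that arithmetic; the function names are illustrative, and the 25 x 75 mm scan area used for the whole-slide estimate is an assumed, approximate size for a microscope slide rather than a figure taken from the text.

```python
import math


def pixels_per_feature(diameter_um, resolution_um):
    """Approximate pixel count: area of a circular feature divided by pixel area."""
    feature_area = math.pi * (diameter_um / 2) ** 2
    return feature_area / resolution_um ** 2


def pixels_per_scan(width_mm, height_mm, resolution_um):
    """Number of pixels in a rectangular scan area at a given resolution."""
    pixels_per_mm = 1000 / resolution_um
    return round(width_mm * pixels_per_mm) * round(height_mm * pixels_per_mm)


for resolution in (5, 10, 20, 40):
    print(
        resolution,
        round(pixels_per_feature(100, resolution)),
        pixels_per_scan(25, 75, resolution),  # assumed 25 x 75 mm scan area
    )
```

For a 100 µm feature this gives about 79 pixels at 10 µm and about 314 pixels at 5 µm, in line with the approximate values above. Under the assumed scan area a full slide comprises close to 19 million pixels at 10 µm, and halving the pixel size multiplies both counts by four.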
Superficially it may seem like the best choice of resolution would be the highest resolution possible, but this is not always the case. As the resolution doubles, the number of pixels quadruples regardless of whether the pixels are of interest. This results in longer scan times and a quadrupling of the size of the image files. For light use this might not matter much, as the increase in time and file size for a few scans is unlikely to become a problem. However, in large-scale projects that produce large arrays and numerous images, the time spent on scanning and transferring files, as well as the storage space needed for large image files, makes it impractical to use higher resolutions. It follows that the resolution should be chosen according to what is needed for a particular application. For most microarray applications the 10 µm resolution will be suitable, since the feature diameter will usually be between 100 and 150 µm, meaning that there will be approximately 80 to 180 pixels per feature. However, some applications, such as arrays produced through lithography, can produce features at the nanometre scale and consequently benefit from, or demand, high-resolution scans. Due to the large overhead of pixels originating from outside the array or in between features, extraction of data is preceded by defining the areas occupied by each individual feature. Thankfully, this is to a large extent achieved through software that uses algorithms to identify the location of the features. The automatic identification is followed by manual input for those features that are misaligned, ill-defined, missing or otherwise in need of attention. Once the features have been localised, the software will separate the values from each feature from the values immediately surrounding it. The values can then be exported to a more easily handled text-file format.
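The separation of feature pixels from the immediately surrounding pixels can be sketched as follows. This is a simplified illustration and not the algorithm of any particular image analysis software: it assumes that the feature centre and radius are already known, treats the feature as a perfect circle, and takes a ring of hypothetical width `ring_width` around it as the local background.

```python
from statistics import median, stdev


def extract_feature(image, cx, cy, radius, ring_width):
    """Split pixels into a circular foreground and a surrounding background ring.

    image is a list of rows of pixel intensities; cx, cy, radius and
    ring_width are given in pixels and are assumed to be known already.
    """
    foreground, background = [], []
    outer = radius + ring_width
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            d2 = (x - cx) ** 2 + (y - cy) ** 2  # squared distance to the centre
            if d2 <= radius ** 2:
                foreground.append(value)
            elif d2 <= outer ** 2:
                background.append(value)
    return {
        "pixels": len(foreground),
        "foreground_median": median(foreground),
        "foreground_sd": stdev(foreground),
        "background_median": median(background),
    }


# Synthetic 21 x 21 pixel image: a bright feature of radius 5 on a dim background.
image = [
    [1000 if (x - 10) ** 2 + (y - 10) ** 2 <= 25 else 50 for x in range(21)]
    for y in range(21)
]
print(extract_feature(image, cx=10, cy=10, radius=5, ring_width=3))
```

At a 10 µm resolution a radius of five pixels corresponds to a 100 µm feature, and the synthetic feature here covers 81 pixels. The sketch reports foreground and local background separately and deliberately does not subtract one from the other, since subtraction is not always beneficial.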
Simple calculations, such as the standard deviation for each feature, can often be performed, and notes on feature quality can be added manually or automatically. Once the data have been extracted and saved as more easily addressable files, the question of how to evaluate the data arises. Depending on the experiment, extensive statistics may have to be performed in order to correct for experimental variation and abnormalities. However, no “catch-all” method for analysis of protein microarray data exists, due to the diverse assay methodologies that are possible for protein microarrays. Reference designs similar to those used for DNA arrays have been successfully implemented for protein arrays141, 142.
One drawback with applying statistical methods developed for DNA arrays, however, is that they often assume a normal distribution that is symmetric with respect to up- and down-regulation of expression, and this is rarely the case for protein data. Nevertheless, the basic steps of filtering bad features, performing background correction, and normalising within and between arrays are similar between DNA and protein arrays. An alternative to the reference design is to normalise for the amount of target in each feature143, which is an attractive alternative if the proteins have been expressed as fusions to a common tag. A special case is the reverse phase arrays, which often utilise dilution curves and sometimes suffer from large spatial variations due to their thicker substrates. As a result, some analysis methods tailored specifically for reverse phase arrays have been developed144-146. All in all, many protein microarray assays require their own tailored strategy for data analysis depending on the type of assay.
Concluding Remarks
So how does all this tie together? To an outside observer, protein microarrays may seem like a simple solution to the problem of conducting fast and efficient analysis without using large volumes of precious samples or insurmountable amounts of time. In reality, protein microarrays are a complex collection of different experimental methods whose common ground is the use of proteins in a “capture molecule – analyte molecule” setup, and the attachment of either of these molecules to a solid substrate in predetermined positions. As might have become apparent in the preceding text, this basic setup quickly branches out into multiple platforms that differ from each other with regard to arraying instruments, substrates, buffers, and detection methods. The final type of assay then depends on what kind of capture and analyte molecules are being used and on the capability of the laboratory performing the analysis. The final assays can roughly be divided into three groups. The first group comprises assays where the orientation and functionality of the immobilised reagent are of importance for the success of the experiment. The second group comprises assays where the orientation and functionality of the immobilised reagent are not of vital importance for the success of the experiment. The third group comprises assays where the immobilised reagent is a complex mixture of high concentration. The most obvious examples of these would be the antibody array, the protein fragment or peptide array, and the reverse phase array, respectively. To be able to perform any of these three general types of experiments, the only special equipment needed is an instrument for producing the arrays, a scanner, and a healthy dose of curiosity and love for a good challenge. In the four papers described under “Present Investigations” I have detailed some of the work that has gone into understanding some of these technologies and the development of our own platforms for arraying and analysing reagents. Papers I and III are examples of developing and discovering the possibilities and limitations of the reverse phase format while using the instrumentation at hand, rather than the instrumentation that would have been optimal for the task. In paper I we were limited by the equipment we had at hand and, as all good engineers and scientists do, did the best we could with the equipment at our disposal, and the final result met our goal. However, after paper I there were still unresolved questions with regard to the accuracy and robustness of our reverse phase platform. Those questions were then answered to some extent during the work that laid the foundation for paper III. What these two papers show is that it was possible to do high-throughput screening of blood-derived samples with high reproducibility, for proteins that could be considered to be of high or medium abundance. They also showed the impact and importance of being able to perform all these tests simultaneously, as the reproducibility within and between experiments was more robust for the array assays than for the “gold standard” ELISA. Today a new reverse phase platform is being developed using more suitable instrumentation, and we are taking another step in the evolution of the analytical space of our microarray platforms. Papers II and III detail some of the work that we have performed on the development and scale-up of our protein microarray platform for high-throughput validation and profiling of affinity reagents and biofluids.
There is a remarkable difference between repeatedly performing a routine analysis and performing analysis using unknown reagents, especially when the work has to be performed in a high-throughput manner. This was the main issue in paper II, as we performed validation experiments on hundreds of reagents with unknown properties in a short time. By applying the knowledge we had previously gained from the routine analysis of HPA antibodies, as to what a successful analysis could be expected to look like, we were able to develop and implement several new protocols using different buffers and detection reagents. This also meant that we got confirmation that our routine analysis performed within HPA was a valid analysis method also for other types of affinity reagents, and that the setup was functional in a broader affinity reagent context. The arrays that are routinely produced for the validation of HPA antibodies have also found use in other applications, such as screening for targets of autoantibodies in autoimmune disease contexts. One of the difficulties encountered when trying to analyse a sample with entirely unknown interaction partners on microarrays containing few possible targets is the problem of recognising a functional experiment if no targets result in clearly defined signals in the array. In such a scenario one might be inclined to believe that the experimental protocol is at fault, and either abandon the project or start unnecessary optimisation attempts. The need for larger arrays for screening is what drove the development of the large-scale and high-density protein fragment array detailed in paper IV. The combination of an arraying buffer with low vapour pressure and a high-speed arraying instrument led to the possibility of arraying 384-well microtiter plates with protein fragments in a fast and efficient manner, while minimising evaporation and sample consumption. We could thus store plates full of protein fragments that had been produced over a long period of time and accrue a large library of protein fragments for construction of large-scale protein microarrays. These microarrays, which comprise tens of thousands of features, have since proven to be useful both for validation and profiling of HPA antibodies and for autoimmunity profiling of human serum. The background behind these four papers is more thoroughly described in the chapter Present Investigations, and the articles themselves are included in the appendix.
Present Investigations
Paper I - Screening for C3 Deficiency in Newborns Using Microarrays.
Aims of the investigation
Complement factor 3 (C3) deficiency is an inherited defect of the immune system and might result in organ damage or death due to severe infections. It is caused by mutations in the C3 gene resulting in absent or diminished levels of C3, or a dysfunctional variant of C3. The disorder is rare, and is commonly sorted under the larger group of inherited defects of the immune system called “primary immunodeficiency disorders”. The deficiency is mostly noted in infancy due to life-threatening infections, and neonatal screening for detection of C3 deficiency before the onset of severe infection would be desirable, so that prophylactic treatment can be administered to reduce adverse effects. We had a previously established platform for screening of C3 deficiency in serum in a reverse phase protein array (RPPA) format, and we now aimed to extend that platform to eluates from dried blood spot samples (DBSS). The reason for using DBSS was that they are routinely collected from newborns at birth and are already widely used for neonatal screening. The results that we acquired from the RPPA format were compared to values acquired through ELISA, to confirm the relevance of the findings for both serum and DBSS eluates.
Summary of findings
Two C3 deficient serum samples and twelve control serum samples were analysed for their C3 levels by both RPPA and ELISA. No C3 was detected for the C3 deficient samples on either of the platforms, while both platforms detected C3 in the control samples. Two C3 deficient DBSS eluates and 278 control DBSS eluates were arrayed together with the serum samples. No C3 was detected in either of the deficient DBSS samples, while C3 was readily detected in most of the control DBSS samples. Some of the control DBSS samples showed low levels of C3 on the RPPA as compared to ELISA and were reprinted six times for a total of up to 47 replicates. The protein content of these six discrepant samples was then determined by SDS-PAGE and compared to normal controls. Some low molecular weight bands present in the discrepant samples were noted, suggesting degradation of C3.
Results and conclusions
The correlation between RPPA and ELISA was generally low. However, the RPPA showed a considerably lower coefficient of variation (15±6%) than the ELISA (29±13%). This might be explained by the lack of batch effects in the RPPA format, but it did not explain the consistently low RPPA values for six of the controls. Both the RPPA and the ELISA used polyclonal anti-C3 antibodies for detection of C3, so it remains to be determined why the ELISA would more readily detect degraded C3, in combination with a generally higher variation. Unfortunately this question could not be fully answered. Overall, the RPPA method developed for DBSS proved able to detect deficiencies, although it seemed prone to false positives, making it useful as a screening method for C3 deficiency. However, the RPPA would then need to be followed up by validation of identified deficiencies on an alternative platform.
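For reference, a coefficient of variation reported as a mean with a spread, such as the 15±6% above, can be obtained along the following lines. This is a minimal sketch under the assumption that one coefficient of variation is computed per sample across its replicates and that these are then summarised as mean ± standard deviation; how the values above were actually calculated is not specified here, and the replicate intensities in the example are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """Coefficient of variation in percent: sample SD divided by the mean."""
    return 100 * stdev(replicates) / mean(replicates)


def summarise_cv(samples):
    """Mean and SD of the per-sample coefficients of variation (assumed summary)."""
    cvs = [coefficient_of_variation(replicates) for replicates in samples]
    return mean(cvs), stdev(cvs)


# Hypothetical replicate intensities for three samples.
samples = [
    [90, 100, 110],
    [80, 100, 120],
    [450, 500, 520],
]
cv_mean, cv_sd = summarise_cv(samples)
print(f"CV = {cv_mean:.0f} ± {cv_sd:.0f} %")
```

Because the coefficient of variation is normalised to the mean, it allows the precision of platforms with different signal scales, such as fluorescence intensities and ELISA concentrations, to be compared directly.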
Figure 7: Data correlation between RPPA and ELISA assays for C3 levels in DBSS. The correlation between serum microarray intensities and ELISA values for the Swedish C3 deficient patient (green circle) and 269 controls. Six samples (marked in red) show low intensities on the microarrays when compared to C3 (g/l). The results for the low intensity samples are based on 6 printings and a total of 20-47 replicates.
Paper II - Validation of Affinity Reagents Using Antigen Microarrays.
Aims of the investigation
This investigation was initiated in the context of the SH2-consortium14, which was an international program to generate a complete set of affinity reagents to SH2-containing human proteins. The consortium comprised several research groups developing different types of affinity reagents towards the 110 known SH2-containing proteins encoded in the human genome. There were two sources for production of antigens representing the SH2-containing proteins: the Structural Genomics Consortium147 (SGC), which produced the SH2-domains, and the Human Protein Atlas52 (HPA), which produced protein fragments. The affinity reagents towards these antigens were generated by five research groups, as well as multiple companies, and comprised antigen-purified polyclonal rabbit antibodies, mouse monoclonal antibodies and recombinant single-chain variable fragments (ScFv). In total 64 SH2-domain containing proteins were represented by 105 antigens, and 398 affinity reagents towards these antigens were generated. Due to the varying origin and production methods of the antigens and affinity reagents, a versatile high-throughput protein microarray setup that could be used for validation of reagent specificity had to be established. We had previously developed the protein microarray for validation of antibodies produced within HPA, and now broadened the applicability of this platform to comprise more antigens and other methods of detection. To achieve this, we supplemented the 105 SH2-antigens with 301 non-SH2 protein fragments, arrayed these antigens in 14 identical subarrays per slide, and performed optimisation for each set of binders to establish working protocols. This encompassed establishing suitable dilutions of reagents and compatible buffers.
Figure 8: Design of the SH2-based antigen microarrays. (a) The layout and distribution of the 432 spotted antigens. The red spots correspond to SH2-related HPA antigens and the blue spots correspond to non-SH2 HPA antigens. The green spots indicate where the SGC SH2-domains are spotted. (b) Quality assurance of the produced antigen microarrays is performed with a chicken anti-His6ABP antibody, visualising the presence of all HPA-produced antigens.
Summary of findings
Despite the complexity of performing high-throughput validation of a diverse set of affinity reagents, we could identify reagents from all providers towards all SH2-domain antigens. Using our microarrays we could determine that approximately 50% of the affinity reagents had binding profiles where the most prominent signal was from their intended target, and only approximately 10% had low or no interaction with their intended target while displaying reactivity with other antigens.
Figure 9: Schematic view of the results from validation of a monoclonal antibody towards protein GRB2. The y-axis in the histogram shows the relative signal intensity from each arrayed antigen. The x-axis contains all antigens in the order that they were arrayed, with the recombinant protein fragments produced in the HPA to the left as single replicates and the domains produced by SGC to the right in duplicates. The signals from the HPA GRB2 antigens are shown as green bars, and the signals from the SGC domain are shown as red bars. Off-target interactions are shown as black bars. In the sequence plot displayed below the histogram, the full-length protein is drawn as a horizontal bar coloured in grey with the endpoints of the SH2-domain marked with blue vertical bars. The HPA protein fragments are drawn as green horizontal bars and the SGC SH2-domain as a red horizontal bar. All protein fragments have their first and last amino acid written next to them, numbered according to the full-length protein. This example shows that the monoclonal antibody that was produced against the SGC-produced SH2-domain of GRB2 recognises the proper target, and that the epitope is likely to reside in the 50 amino acid stretch between amino acids 109 and 159.
Results and conclusions Our investigation was triggered by the need for a high-throughput platform for the validation of on-target interactions of a large number of diverse affinity reagents from multiple providers. At first glance this task seemed daunting, as we recognised the need to combine a high-throughput workflow with multiple optimisations for reagents with which we had little or no previous experience. We approached this task by using our established platform as a springboard to launch the optimisation assays that were needed to accommodate the different reagents. We also recognised the need for previous experience of microarray method development and image analysis to interpret results and adjust protocols accordingly. Approximately 600 tests, including optimisations, were performed for the 398 affinity reagents. This means that fewer than two tests per reagent were needed on average. This clearly showed the strength of protein microarrays as validation tools for affinity reagent production.
Paper III - Biosensor Based Protein Profiling on Reverse Phase Serum Microarray.
Aims of the investigation This investigation was an extension of our previous work using RPPAs. As mentioned in conjunction with Paper I, we had noticed a discrepancy between the results obtained using the RPPA and the ELISA. We had now produced a new RPPA with almost 2.500 serum samples to investigate the feasibility of screening for IgA deficiency using this platform. We once again observed a clear discrepancy between the data obtained through the RPPA and the ELISA. We had been offered the opportunity to evaluate a novel instrument for our work and to develop new assays for the platform. We therefore decided to bring this investigation onto this biosensor-based microarray platform to confirm the previous results. The biosensor-based platform utilised surface plasmon resonance (SPR) to measure the change in refractive index that occurs as the mass changes when a ligand binds to a protein on a metal surface. The output after data acquisition consists of full binding curves that allow for extraction of interaction kinetics. The instrument that we had been given access to had the capability to measure 400 binding curves simultaneously for six consecutive ligand injections, for a total of up to 2.400 binding curves per chip.
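To illustrate what a full binding curve contains, the following is a minimal sketch that assumes a simple 1:1 interaction model. The thesis does not state which kinetic model was used, so the model, the function names, and all rate constants below are assumptions chosen only for illustration. The chip capacity is taken from the text above.

```python
import math

# Capacity stated in the text: 400 simultaneous curves x 6 consecutive injections.
SPOTS_PER_INJECTION = 400
INJECTIONS_PER_CHIP = 6
CURVES_PER_CHIP = SPOTS_PER_INJECTION * INJECTIONS_PER_CHIP


def association(t, conc, ka, kd, rmax):
    """Response at time t (s) while the binder is injected at concentration conc (M)."""
    k_obs = ka * conc + kd              # observed rate constant (1/s)
    r_eq = rmax * ka * conc / k_obs     # plateau response at this concentration
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation(t, r0, kd):
    """Response t seconds after the injection ends, starting from response r0."""
    return r0 * math.exp(-kd * t)


# Made-up rate constants, used only to illustrate the shape of a curve.
ka = 1.0e5       # association rate constant, 1/(M*s)
kd = 1.0e-3      # dissociation rate constant, 1/s
KD = kd / ka     # equilibrium dissociation constant, M
rmax = 100.0     # maximal response, arbitrary units

# 600 s injection at a concentration equal to KD, followed by 600 s of buffer flow.
r_end = association(600.0, KD, ka, kd, rmax)
curve = [association(t, KD, ka, kd, rmax) for t in range(0, 601, 60)]
curve += [dissociation(t, r_end, kd) for t in range(60, 601, 60)]
```

Under this assumed model, extracting kinetics amounts to estimating `ka` and `kd` from the rising and falling parts of a measured curve. At a concentration equal to `KD`, the plateau response is half of `rmax`.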
Summary of findings A total of 2.423 serum samples were arrayed on glass slides and analysed for their IgA content. Of those samples, 182 were then re-arrayed on glass slides and arrayed on sensor chips for the SPR measurement. This set of 182 samples comprised 28 samples that had been identified as IgA deficient using both the RPPA and the ELISA, 100 samples that showed a large discrepancy between the RPPA and the ELISA, and 54 samples that were randomly chosen from the sample set. The within-slide and between-slide reproducibility of the glass slides was investigated, as well as the reproducibility between printings. For the SPR chips, the between-chip reproducibility was also investigated. The correlations between the three platforms (RPPA, ELISA, and SPR) were also assessed.
Figure 10: Reproducibility of the label-based RPPA platform. A) Plotting the spot replicates within each slide in a correlation plot shows a correlation of R = 0,98 between spot replicates. B) Plotting two slide replicates shows a correlation coefficient of R = 0,98 between slides as well, albeit with a lower correlation for samples with higher target concentration. C) Plotting the 182 samples that were reprinted and reanalysed on the fluorescence-based arrays shows a correlation coefficient of R = 0,9. This indicates that the spotting procedure was robust, with acceptable reproducibility both within each print run and between print runs.
Results and conclusions Due to the time- and sample-consuming nature of the ELISA, no replicates were performed on the ELISA platform unless a sample showed a large discrepancy between the RPPA and the ELISA. Consequently, no proper investigation into the reproducibility of the ELISA could be included in the final paper. The reproducibility within the RPPA platform and the SPR platform, and between the two array platforms, proved to be excellent. A Spearman correlation over 0,9 was obtained for the within-platform comparisons, and just below 0,9 for the between-platform comparison. The ELISA showed less agreement, with a correlation of only 0,66 (ELISA vs. RPPA) and 0,76 (ELISA vs. SPR), as shown by the correlation plots in figure 11. We had previously suspected that the discrepancy between the RPPA and the ELISA was driven by the ELISA data, and the new platform seemed to confirm this. The coefficients of variation (CV) for the platforms were not included in the final article due to the lack of appropriate replicates for the ELISA platform. However, for those samples that were replicated on the ELISA the CV was 50%, while for the RPPA and the SPR the CVs were 17% and 5%, respectively. Many of the samples that were replicated on the ELISA had been chosen for reanalysis due to their discrepancy with their RPPA result. Therefore, the observed CV was not deemed fully representative of the platform.
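The two statistics used in this comparison, the Spearman correlation and the coefficient of variation, can be sketched with the Python standard library alone. This is an illustrative sketch, not the analysis code used for Paper III. The helper names and all measurement values are made up.

```python
from statistics import mean, stdev


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def cv_percent(replicates):
    """Coefficient of variation in percent (sample standard deviation / mean)."""
    return 100.0 * stdev(replicates) / mean(replicates)


# Made-up signals for five samples measured on three platforms.
rppa = [1.2, 2.5, 3.1, 4.8, 5.0]
spr = [10.0, 24.0, 31.0, 45.0, 52.0]
elisa = [1.0, 3.0, 2.0, 5.0, 4.0]
rho_arrays = spearman(rppa, spr)    # identical ordering of samples
rho_elisa = spearman(rppa, elisa)   # two pairs of samples swap order
```

Because the Spearman correlation depends only on the ordering of the samples, it is insensitive to the different signal scales of the three platforms, which makes it a natural choice for a cross-platform comparison.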
Figure 11: Correlation between the RPPA platforms and the ELISA. A) The two microarray-based platforms show acceptable agreement with a correlation coefficient of R = 0,89. B) The correlation plot between the fluorescence-based array and the ELISA shows less agreement, with a correlation coefficient of R = 0,66. C) The correlation plot between the SPR-based array and the ELISA also shows less agreement than that between the two array-based platforms, with a correlation coefficient of R = 0,76.
The SPR platform proved to generate data with CV < 5% and with the possibility to extract full binding kinetics if necessary. It was consequently deemed to be useful for measuring IgA in serum in an RPPA-like format. As a side note, we had previously investigated the instrument's applicability for detection of proteins in serum in a capture array format. However, this met with little success due to insurmountable difficulties with clogging of the fluidics system in the instrument. After several failed attempts, it was revealed that this was a known problem for the instrument, caused by the high protein concentration in serum. Another drawback of the instrument was the extraordinarily large sample volume of 1,6 ml that was needed for injection, which precluded the use of the instrument for many applications. The instrument was later withdrawn from the market as other instruments with similar function that were less sample demanding became more widespread.
Paper IV - Exploration of High-density Protein Microarrays for Antibody Validation and Autoimmunity Profiling.
Aims of the investigation All antibodies generated within the Human Protein Atlas (HPA) are validated for their on-target binding with protein microarrays that comprise their cognate recombinant protein fragment and 383 other fragments, which are currently in the HPA antibody production pipeline. New batches of 384 fragments are arrayed regularly for continuous testing of cognate antibodies. The 384-well microplates that contain the protein fragments can be stored and reused for production of new microarrays, and multiple batches can thus be combined to produce larger arrays. We took advantage of this possibility and produced a set of arrays containing 11.520 protein fragments with two subarrays per slide, and a set of arrays containing 21.120 protein fragments in one subarray per slide. This undertaking was initiated to advance our protein microarray platform and to investigate the relevance of the classifications that are assigned to each antibody during validation in the HPA.
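The arithmetic behind combining batches can be sketched as follows. The plate counts are derived here by division and are not stated in the text. The function names are hypothetical, and the feature count assumes one spot per fragment per subarray, which the text does not confirm.

```python
WELLS_PER_PLATE = 384  # one batch of protein fragments per 384-well microplate


def plates_needed(n_fragments, wells_per_plate=WELLS_PER_PLATE):
    """Number of full or partial plates required to supply n_fragments."""
    full, remainder = divmod(n_fragments, wells_per_plate)
    return full + (1 if remainder else 0)


def features_per_slide(n_fragments, subarrays_per_slide):
    """Spotted features per slide, assuming one spot per fragment per subarray."""
    return n_fragments * subarrays_per_slide


designs = {
    "11.520 fragments, two subarrays": (11520, 2),
    "21.120 fragments, one subarray": (21120, 1),
}
summary = {
    name: (plates_needed(n), features_per_slide(n, subarrays))
    for name, (n, subarrays) in designs.items()
}
```

Both array sizes are exact multiples of 384 (30 and 55 plates, respectively), which is consistent with whole batches being combined.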
Summary of findings We profiled 48 polyclonal antibodies on arrays with 384 and 11.520 protein fragments, and 10 of those antibodies were further analysed on arrays with 21.120 protein fragments. Nine serum samples from patients with secondary progressive multiple sclerosis were also analysed for their autoimmune reactivity profiles on the arrays with 11.520 protein fragments, and two of those were further profiled on the arrays with 21.120 protein fragments. Two of the antibodies and two of the serum samples were finally profiled on a commercially available protein microarray platform comprising over 17.000 full-length proteins. Among the 48 polyclonal antibodies profiled on arrays with 384 and 11.520 protein fragments, the antibodies with more off-target interactions on 384 protein fragments showed an increasing spread of off-target interactions on 11.520 protein fragments, as seen in figure 12A. This indicates that the routine validation performed on protein fragment arrays in the HPA is able to detect antibodies with a high number of off-target interactions towards complex samples. A subset of the antibodies that showed one single off-target interaction >40% of the on-target interaction, and a subset of the group of antibodies that showed no off-target interactions, were further analysed on the arrays with 21.120 protein fragments, as seen in figure 12B. This revealed that antibodies with a single strong off-target interaction do not necessarily show a large increase in off-target interactions compared to antibodies that showed no off-target interaction on 384 protein fragments. Consequently, antibodies with only one strong off-target interaction on 384 protein fragments may not need to be failed in the validation step. One example that illustrates the increasing number of off-target interactions that could be expected from an antibody as it is exposed to more possible epitopes is shown in figure 13, with barplots from 384, 11.520 and 21.120 protein fragments and the full-length protein microarray. The lack of off-target interactions towards full-length proteins implies that epitopes detected in the protein fragments might be hidden in the full-length folded proteins, thus making the antibody more specific than what is implied by the protein fragment arrays.
Figure 12: The number of off-target interactions with a relative signal >15% of the on-target interaction. A) The x-axis shows the number of off-target interactions (>15% of on-target) on arrays with 384 protein fragments, plotted in groups from zero to more than three interactions, with an additional group for those antibodies that showed one strong off-target interaction (>40% of the on-target interaction). The y-axis shows the number of off-target interactions on arrays with 11.520 protein fragments. B) The green connected dots are five antibodies that had no off-target interactions on 384 protein fragments and the red connected dots are five antibodies with one off-target interaction >40% of the on-target interaction on 384 protein fragments.
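The thresholds named in the caption of figure 12 (off-target signals above 15% of the on-target signal, and strong off-target signals above 40%) can be expressed as a short counting routine. This is a minimal sketch: the function names, the dictionary layout, and the example intensities are hypothetical and do not come from the HPA validation pipeline.

```python
def count_off_targets(signals, target, threshold=0.15):
    """Count antigens whose signal exceeds `threshold` times the on-target signal."""
    on_target = signals[target]
    return sum(
        1
        for antigen, value in signals.items()
        if antigen != target and value > threshold * on_target
    )


def has_strong_off_target(signals, target, strong=0.40):
    """True if any off-target signal exceeds `strong` times the on-target signal."""
    return count_off_targets(signals, target, threshold=strong) > 0


# Made-up profile for one antibody on a small array.
profile = {"target": 1000.0, "frag_1": 450.0, "frag_2": 200.0, "frag_3": 100.0}
n_off = count_off_targets(profile, "target")
strong = has_strong_off_target(profile, "target")
```

Applying the same routine to the profiles from arrays of different sizes would give the paired counts that are plotted against each other in the figure.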
Results and conclusions The assembly of 384-well microplates for production of arrays for validation of antibodies enabled the generation of large-scale arrays with 21.120 protein fragments. These large-scale arrays can be further expanded and hold the potential to become an important addition to our toolbox for profiling both affinity reagents and biofluids. Using these arrays, we could show that the validation performed in the HPA carries relevance when considering large collections of antigens. The production of these arrays also hints at the feasibility of constructing large antibody arrays using the HPA antibodies, as these antibodies are also stored in microplates and in a buffer suitable for direct arraying. We also had the opportunity to compare our results from both antibodies and autoimmunity samples to a full-length protein microarray. This showed that the full-length protein microarrays are useful for validation of on-target binding and screening of interaction partners of antibodies, and that they add value to our autoimmunity screening.
Figure 13: The first three barplots show the binding profile for one antibody on arrays with 384, 11.520, and 21.120 protein fragments. The initial binding profile on 384 protein fragments revealed a single off-target interaction. As the number of protein fragments in the array increased, the number of identified off-target interactions increased as well, with the correct antigen marked by red bars and the initial off-target interaction marked in black. At the bottom is the same antibody tested on full-length protein microarrays containing approximately 17.000 full-length proteins, where only one strong off-target interaction was detected.
Future Perspectives
This thesis details the work that constitutes the foundation of the protein microarray platform. The platform has undergone a steady evolution from what we today consider small-scale arrays towards more, bigger, faster, and better. The work will, of course, not end now. An antigen array comprising almost all antigens produced within the HPA will soon be in the making. This will give us unprecedented possibilities for profiling of reagents and samples, with over 40.000 unique recombinant protein fragments as targets. To achieve this, new arraying procedures and new protocols for analysis will have to be developed. On the reverse phase array platform, we aim to array a complete biobank of approximately 12.000 serum samples for investigation of the levels of higher-abundance proteins across various diseases. This would possibly represent the largest RPPA produced so far, and will enable new and exciting research avenues for the protein microarray platform. Last, but definitely not least, looms the possibility of arraying one of the world’s largest collections of antibodies. When the Human Protein Atlas summarised its knowledge of the human proteome on November 6th 2014, it comprised proteome analysis of almost 17.000 unique proteins that had been analysed using approximately 24.000 antibodies. Thousands of these antibodies, together with many more that have not yet been published, are currently accessible from plates that have previously been available for the validation. One can envision a transfer of aliquots of these antibodies into plates suitable for arraying, which would enable the generation of capture arrays with tens of thousands of antibodies. This would then allow for profiling of protein levels for thousands of proteins simultaneously on a single slide, which would open up entirely new possibilities for proteomic research.
The field of affinity proteomics is set to continue growing and evolving, and I hope to continue to be a part of the efforts that advance it into the future by discovering new uses and developing new applications for protein microarrays.
Acknowledgements
Peter, my main supervisor, who has given me the freedom to be creative and to test what I wanted to test, without questioning what use all those plots of buffers, morphologies, and background signals really are. Jochen, my co-supervisor, for fruitful discussions about this and that, and for the red pen that corrected my sloppy writing. Mathias, for gathering all these people under the umbrella of the Protein Atlas. Everyone in the HPA and at AlbaNova whom I have worked with during these years. PAPP, from being just PA we have become PAPP, and the more the merrier. Burcu, who has put up with me as an office neighbour and with my constant complaining about bad students, bad lab reports, bad articles, bad experiments, and whatever else I have found to whine about. Mårten, who formed the PA trio with Peter and me at the beginning of my time here, and with whom I have diluted many antibodies and marvelled at so many strange printings. My mother Siv, it has been a long and winding road, and without your support I would not have made it here.
References
References 1. 2.
3.
4. 5. 6.
7. 8.
9.
10. 11.
12. 13.
14. 15.
16. 17. 18.
56
Watson, J.D. & Crick, F.H. Molecular structure of nucleic acids; a structure for deoxyribose nucleic acid. Nature 171, 737-738 (1953). McCarty, M. & Avery, O.T. Studies on the Chemical Nature of the Substance Inducing Transformation of Pneumococcal Types : Iii. An Improved Method for the Isolation of the Transforming Substance and Its Application to Pneumococcus Types Ii, Iii, and Vi. J Exp Med 83, 97-104 (1946). Gerstein, M.B., Bruce, C., Rozowsky, J.S., Zheng, D., Du, J., Korbel, J.O., Emanuelsson, O., Zhang, Z.D., Weissman, S. & Snyder, M. What is a gene, post-ENCODE? History and updated definition. Genome Res 17, 669-681 (2007). Morris, K.V. & Mattick, J.S. The rise of regulatory RNA. Nat Rev Genet 15, 423-437 (2014). Crick, F. Central dogma of molecular biology. Nature 227, 561-563 (1970). Wilkins, M.R., Pasquali, C., Appel, R.D., Ou, K., Golaz, O., Sanchez, J.C., Yan, J.X., Gooley, A.A., Hughes, G., Humphery-Smith, I., Williams, K.L. & Hochstrasser, D.F. From proteins to proteomes: large scale protein identification by two-dimensional electrophoresis and amino acid analysis. Bio/technology 14, 61-65 (1996). Consortium, E.P. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57-74 (2012). Ponten, F., Gry, M., Fagerberg, L., Lundberg, E., Asplund, A., Berglund, L., Oksvold, P., Bjorling, E., Hober, S., Kampf, C., Navani, S., Nilsson, P., Ottosson, J., Persson, A., Wernerus, H., Wester, K. & Uhlen, M. A global view of protein expression in human cells, tissues, and organs. Mol Syst Biol 5, 337 (2009). Gry, M., Rimini, R., Stromberg, S., Asplund, A., Ponten, F., Uhlen, M. & Nilsson, P. Correlations between RNA and protein expression profiles in 23 human cell lines. BMC Genomics 10, 365 (2009). Anderson, N.L. & Anderson, N.G. The human plasma proteome: history, character, and diagnostic prospects. Mol Cell Proteomics 1, 845-867 (2002). 
Schroder, C., Jacob, A., Tonack, S., Radon, T.P., Sill, M., Zucknick, M., Ruffer, S., Costello, E., Neoptolemos, J.P., Crnogorac-Jurcevic, T., Bauer, A., Fellenberg, K. & Hoheisel, J.D. Dual-color proteomic profiling of complex samples with a microarray of 810 cancer-related antibodies. Mol Cell Proteomics 9, 1271-1280 (2010). Stoevesandt, O. & Taussig, M.J. Affinity proteomics: the role of specific binding reagents in human proteome analysis. Expert Rev Proteomics 9, 401-414 (2012). Larsson, M., Brundell, E., Nordfors, L., Hoog, C., Uhlen, M. & Stahl, S. A general bacterial expression system for functional analysis of cDNA-encoded proteins. Protein Expr Purif 7, 447-457 (1996). Colwill, K. & Graslund, S. A roadmap to generate renewable protein binders to the human proteome. Nat Methods 8, 551-558 (2011). Litman, G.W., Rast, J.P., Shamblott, M.J., Haire, R.N., Hulst, M., Roess, W., Litman, R.T., HindsFrey, K.R., Zilch, A. & Amemiya, C.T. Phylogenetic diversification of immunoglobulin genes and the antibody repertoire. Molecular Biology and Evolution 10, 60-72 (1993). Market, E. & Papavasiliou, F.N. V(D)J recombination and the evolution of the adaptive immune system. PLoS Biol 1, E16 (2003). Woof, J.M. & Burton, D.R. Human antibody-Fc receptor interactions illuminated by crystal structures. Nat Rev Immunol 4, 89-99 (2004). Ehrenstein, M.R. & Notley, C.A. The importance of natural IgM: scavenger, protector and regulator. Nat Rev Immunol 10, 778-786 (2010).
References
19. 20. 21. 22. 23. 24. 25.
26. 27. 28. 29. 30. 31. 32. 33.
34. 35. 36. 37. 38.
39. 40. 41.
Vollmers, H.P. & Brandlein, S. Natural IgM antibodies: from parias to parvenus. Histol Histopathol 21, 1355-1366 (2006). Vidarsson, G., Dekkers, G. & Rispens, T. IgG subclasses and allotypes: from structure to effector functions. Front Immunol 5, 520 (2014). Novak, N., Kraft, S. & Bieber, T. IgE receptors. Current Opinion in Immunology 13, 721-726 (2001). Macpherson, A.J., McCoy, K.D., Johansen, F.E. & Brandtzaeg, P. The immune geography of IgA induction and function. Mucosal Immunol 1, 11-22 (2008). Delacroix, D.L., Dive, C., Rambaud, J.C. & Vaerman, J.P. IgA subclasses in various secretions and in serum. Immunology 47, 383-385 (1982). Edholm, E.-S., Bengten, E. & Wilson, M. Insights into the function of IgD. Developmental & Comparative Immunology 35, 1309-1316 (2011). Lipman, N.S., Jackson, L.R., Trudel, L.J. & Weis-Garcia, F. Monoclonal versus polyclonal antibodies: distinguishing characteristics, applications, and information resources. ILAR journal / National Research Council, Institute of Laboratory Animal Resources 46, 258-268 (2005). Rockberg, J., Schwenk, J.M. & Uhlen, M. Discovery of epitopes for targeting the human epidermal growth factor receptor 2 (HER2) with antibodies. Mol Oncol 3, 238-247 (2009). Ehrlich, P.H., Moyle, W.R., Moustafa, Z.A. & Canfield, R.E. Mixing two monoclonal antibodies yields enhanced affinity for antigen. J Immunol 128, 2709-2713 (1982). Milstein, C. The hybridoma revolution: an offshoot of basic research. BioEssays 21, 966-973 (1999). Kohler, G. & Milstein, C. Continuous cultures of fused cells secreting antibody of predefined specificity. Nature 256, 495-497 (1975). Hudson, P.J. Recombinant antibody fragments. Current Opinion in Biotechnology 9, 395-402 (1998). Nelson, A.L. Antibody fragments: hope and hype. mAbs 2, 77-83 (2010). Nygren, P.A. & Skerra, A. Binding proteins from alternative scaffolds. J Immunol Methods 290, 3-28 (2004). Nord, K., Gunneriusson, E., Ringdahl, J., Stahl, S., Uhlen, M. & Nygren, P.A. 
Binding proteins selected from combinatorial libraries of an alpha-helical bacterial receptor domain. Nat Biotechnol 15, 772-777 (1997). Schlehuber, S. & Skerra, A. Anticalins as an alternative to antibody technology. Expert opinion on biological therapy 5, 1453-1462 (2005). Koide, A., Bailey, C.W., Huang, X. & Koide, S. The fibronectin type III domain as a scaffold for novel binding proteins. J Mol Biol 284, 1141-1151 (1998). Stumpp, M.T., Binz, H.K. & Amstutz, P. DARPins: a new generation of protein therapeutics. Drug discovery today 13, 695-701 (2008). Proske, D., Blank, M., Buhmann, R. & Resch, A. Aptamers--basic research, drug development, and clinical applications. Appl Microbiol Biotechnol 69, 367-374 (2005). Kenyon, G.L., DeMarini, D.M., Fuchs, E., Galas, D.J., Kirsch, J.F., Leyh, T.S., Moos, W.H., Petsko, G.A., Ringe, D., Rubin, G.M., Sheahan, L.C. & National Research Council Steering, C. Defining the mandate of proteomics in the post-genomics era: workshop report. Mol Cell Proteomics 1, 763-780 (2002). Uhlen, M. Affinity as a tool in life science. Biotechniques 44, 649-654 (2008). Uhlen, M. & Ponten, F. Antibody-based proteomics for human tissue profiling. Mol Cell Proteomics 4, 384-393 (2005). Ayoglu, B., Haggmark, A., Neiman, M., Igel, U., Uhlen, M., Schwenk, J.M. & Nilsson, P. Systematic antibody and antigen-based proteomic profiling with microarrays. Expert Rev Mol Diagn 11, 219-234 (2011).
57
References 42. 43.
44. 45.
46.
47.
48.
49. 50.
51.
52.
53.
54. 55.
56.
57. 58
Schwenk, J.M., Gry, M., Rimini, R., Uhlen, M. & Nilsson, P. Antibody suspension bead arrays within serum proteomics. J Proteome Res 7, 3168-3179 (2008). Nishizuka, S., Charboneau, L., Young, L., Major, S., Reinhold, W.C., Waltham, M., KourosMehr, H., Bussey, K.J., Lee, J.K., Espina, V., Munson, P.J., Petricoin, E., 3rd, Liotta, L.A. & Weinstein, J.N. Proteomic profiling of the NCI-60 cancer cell lines using new high-density reverse-phase lysate microarrays. Proc Natl Acad Sci U S A 100, 14229-14234 (2003). Stoevesandt, O. & Taussig, M.J. European and international collaboration in affinity proteomics. N Biotechnol 29, 511-514 (2012). Taussig, M.J., Stoevesandt, O., Borrebaeck, C.A., Bradbury, A.R., Cahill, D., Cambillau, C., de Daruvar, A., Dubel, S., Eichler, J., Frank, R., Gibson, T.J., Gloriam, D., Gold, L., Herberg, F.W., Hermjakob, H., Hoheisel, J.D., Joos, T.O., Kallioniemi, O., Koegl, M., Konthur, Z., Korn, B., Kremmer, E., Krobitsch, S., Landegren, U., van der Maarel, S., McCafferty, J., Muyldermans, S., Nygren, P.A., Palcy, S., Pluckthun, A., Polic, B., Przybylski, M., Saviranta, P., Sawyer, A., Sherman, D.J., Skerra, A., Templin, M., Ueffing, M. & Uhlen, M. ProteomeBinders: planning a European resource of affinity reagents for analysis of the human proteome. Nat Methods 4, 13-17 (2007). Haab, B.B., Paulovich, A.G., Anderson, N.L., Clark, A.M., Downing, G.J., Hermjakob, H., Labaer, J. & Uhlen, M. A reagent resource to identify proteins and peptides of interest for the cancer community: a workshop report. Mol Cell Proteomics 5, 1996-2007 (2006). Juncker, D., Bergeron, S., Laforte, V. & Li, H. Cross-reactivity in antibody microarrays and multiplexed sandwich assays: shedding light on the dark side of multiplexing. Current opinion in chemical biology 18, 29-37 (2014). Schwenk, J.M., Igel, U., Neiman, M., Langen, H., Becker, C., Bjartell, A., Ponten, F., Wiklund, F., Gronberg, H., Nilsson, P. & Uhlen, M. 
Toward next generation plasma profiling via heatinduced epitope retrieval and array-based assays. Mol Cell Proteomics 9, 2497-2507 (2010). MacBeath, G. Protein microarrays and proteomics. Nat Genet 32 Suppl, 526-532 (2002). Poetz, O., Henzler, T., Hartmann, M., Kazmaier, C., Templin, M.F., Herget, T. & Joos, T.O. Sequential multiplex analyte capturing for phosphoprotein profiling. Mol Cell Proteomics 9, 2474-2481 (2010). Landegren, U., Vanelid, J., Hammond, M., Nong, R.Y., Wu, D., Ulleras, E. & KamaliMoghaddam, M. Opportunities for sensitive plasma proteome analysis. Anal Chem 84, 18241830 (2012). Uhlen, M., Fagerberg, L., Hallstrom, B.M., Lindskog, C., Oksvold, P., Mardinoglu, A., Sivertsson, A., Kampf, C., Sjostedt, E., Asplund, A., Olsson, I., Edlund, K., Lundberg, E., Navani, S., Szigyarto, C.A., Odeberg, J., Djureinovic, D., Takanen, J.O., Hober, S., Alm, T., Edqvist, P.H., Berling, H., Tegel, H., Mulder, J., Rockberg, J., Nilsson, P., Schwenk, J.M., Hamsten, M., von Feilitzen, K., Forsberg, M., Persson, L., Johansson, F., Zwahlen, M., von Heijne, G., Nielsen, J. & Ponten, F. Proteomics. Tissue-based map of the human proteome. Science 347, 1260419 (2015). Uhlen, M., Oksvold, P., Fagerberg, L., Lundberg, E., Jonasson, K., Forsberg, M., Zwahlen, M., Kampf, C., Wester, K., Hober, S., Wernerus, H., Bjorling, L. & Ponten, F. Towards a knowledge-based Human Protein Atlas. Nat Biotechnol 28, 1248-1250 (2010). Ponten, F., Schwenk, J.M., Asplund, A. & Edqvist, P.H. The Human Protein Atlas as a proteomic resource for biomarker discovery. J Intern Med 270, 428-446 (2011). Mulder, J., Bjorling, E., Jonasson, K., Wernerus, H., Hober, S., Hokfelt, T. & Uhlen, M. Tissue profiling of the mammalian central nervous system using human antibody-based proteomics. Mol Cell Proteomics 8, 1612-1622 (2009). Berglund, L., Bjorling, E., Jonasson, K., Rockberg, J., Fagerberg, L., Al-Khalili Szigyarto, C., Sivertsson, A. & Uhlen, M. 
A whole-genome bioinformatics approach to selection of antigens for systematic antibody generation. Proteomics 8, 2832-2839 (2008). Berglund, L., Bjorling, E., Oksvold, P., Fagerberg, L., Asplund, A., Szigyarto, C.A., Persson, A., Ottosson, J., Wernerus, H., Nilsson, P., Lundberg, E., Sivertsson, A., Navani, S., Wester, K.,
References
58. 59.
60. 61. 62. 63. 64.
65. 66. 67. 68. 69.
70.
71.
72. 73.
74. 75. 76.
77.
Kampf, C., Hober, S., Ponten, F. & Uhlen, M. A genecentric Human Protein Atlas for expression profiles based on antibodies. Mol Cell Proteomics 7, 2019-2027 (2008). Berglund, L., Andrade, J., Odeberg, J. & Uhlen, M. The epitope space of the human proteome. Protein Sci 17, 606-613 (2008). Larsson, M., Gräslund, S., Yuan, L., Brundell, E., Uhlén, M., Höög, C. & Ståhl, S. Highthroughput protein expression of cDNA products as a tool in functional genomics. Journal of Biotechnology 80, 143-157 (2000). Feinberg, J.G. A 'microspot' test for antigens and antibodies. Nature 192, 985-986 (1961). Schena, M., Shalon, D., Davis, R.W. & Brown, P.O. Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467-470 (1995). Ekins, R.P. Multi-analyte immunoassay. J Pharm Biomed Anal 7, 155-168 (1989). Templin, M.F., Stoll, D., Schrenk, M., Traub, P.C., Vohringer, C.F. & Joos, T.O. Protein microarray technology. Trends Biotechnol 20, 160-166 (2002). Kusnezow, W., Syagailo, Y.V., Ruffer, S., Baudenstiel, N., Gauer, C., Hoheisel, J.D., Wild, D. & Goychuk, I. Optimal design of microarray immunoassays to compensate for kinetic limitations: theory and experiment. Mol Cell Proteomics 5, 1681-1696 (2006). Kodadek, T. Protein microarrays: prospects and problems. Chemistry & biology 8, 105-115 (2001). Zhu, H. & Snyder, M. Protein chip technology. Current opinion in chemical biology 7, 55-63 (2003). LaBaer, J. & Ramachandran, N. Protein microarrays as tools for functional proteomics. Current opinion in chemical biology 9, 14-19 (2005). Phizicky, E., Bastiaens, P.I., Zhu, H., Snyder, M. & Fields, S. Protein analysis on a proteomic scale. Nature 422, 208-215 (2003). Paweletz, C.P., Charboneau, L., Bichsel, V.E., Simone, N.L., Chen, T., Gillespie, J.W., EmmertBuck, M.R., Roth, M.J., Petricoin, I.E. & Liotta, L.A. Reverse phase protein microarrays which capture disease progression show activation of pro-survival pathways at the cancer invasion front. 
Oncogene 20, 1981-1989 (2001). Sheehan, K.M., Calvert, V.S., Kay, E.W., Lu, Y., Fishman, D., Espina, V., Aquino, J., Speer, R., Araujo, R., Mills, G.B., Liotta, L.A., Petricoin, E.F., 3rd & Wulfkuhle, J.D. Use of reverse phase protein microarrays and reference standard development for molecular network analysis of metastatic ovarian carcinoma. Mol Cell Proteomics 4, 346-355 (2005). Zhu, H., Bilgin, M., Bangham, R., Hall, D., Casamayor, A., Bertone, P., Lan, N., Jansen, R., Bidlingmaier, S., Houfek, T., Mitchell, T., Miller, P., Dean, R.A., Gerstein, M. & Snyder, M. Global analysis of protein activities using proteome chips. Science 293, 2101-2105 (2001). Merbl, Y. & Kirschner, M.W. Protein microarrays for genome-wide posttranslational modification analysis. Wiley Interdiscip Rev Syst Biol Med 3, 347-356 (2011). Haab, B.B., Dunham, M.J. & Brown, P.O. Protein microarrays for highly parallel detection and quantitation of specific proteins and antibodies in complex solutions. Genome Biol 2, RESEARCH0004 (2001). Sjoberg, R., Sundberg, M., Gundberg, A., Sivertsson, A., Schwenk, J.M., Uhlen, M. & Nilsson, P. Validation of affinity reagents using antigen microarrays. N Biotechnol 29, 555-563 (2012). Rosenberg, J.M. & Utz, P.J. Protein microarrays: a new tool for the study of autoantibodies in immunodeficiency. Frontiers in Immunology 6 (2015). Joos, T.O., Schrenk, M., Hopfl, P., Kroger, K., Chowdhury, U., Stoll, D., Schorner, D., Durr, M., Herick, K., Rupp, S., Sohn, K. & Hammerle, H. A microarray enzyme-linked immunosorbent assay for autoimmune diagnostics. Electrophoresis 21, 2641-2650 (2000). Haggmark, A., Hamsten, C., Wiklundh, E., Lindskog, C., Mattsson, C., Andersson, E., Lundberg, I.E., Gronlund, H., Schwenk, J.M., Eklund, A., Grunewald, J. & Nilsson, P. Proteomic profiling reveals autoimmune targets in sarcoidosis. Am J Respir Crit Care Med 191, 574-583 (2015).
59
References 78. 79.
80. 81.
82. 83. 84.
85.
86.
87.
88. 89. 90. 91.
92.
93.
94. 95.
60
Robinson, W.H. Antigen arrays for antibody profiling. Current opinion in chemical biology 10, 67-72 (2006).
Forsstrom, B., Axnas, B.B., Stengele, K.P., Buhler, J., Albert, T.J., Richmond, T.A., Hu, F.J., Nilsson, P., Hudson, E.P., Rockberg, J. & Uhlen, M. Proteome-wide epitope mapping of antibodies using ultra-dense peptide arrays. Mol Cell Proteomics 13, 1585-1597 (2014).
Park, S., Gildersleeve, J.C., Blixt, O. & Shin, I. Carbohydrate microarrays. Chemical Society Reviews 42, 4310-4326 (2013).
Wang, D., Liu, S., Trummer, B.J., Deng, C. & Wang, A. Carbohydrate microarrays for the recognition of cross-reactive molecular markers of microbes and host cells. Nat Biotechnol 20, 275-281 (2002).
Pei, Z., Yu, H., Theurer, M., Walden, A., Nilsson, P., Yan, M. & Ramstrom, O. Photogenerated carbohydrate microarrays. Chembiochem 8, 166-168 (2007).
Haab, B.B. Methods and applications of antibody microarrays in cancer research. Proteomics 3, 2116-2122 (2003).
Peluso, P., Wilson, D.S., Do, D., Tran, H., Venkatasubbaiah, M., Quincy, D., Heidecker, B., Poindexter, K., Tolani, N., Phelan, M., Witte, K., Jung, L.S., Wagner, P. & Nock, S. Optimizing antibody immobilization strategies for the construction of protein microarrays. Anal Biochem 312, 113-124 (2003).
Butler, J.E., Ni, L., Brown, W.R., Joshi, K.S., Chang, J., Rosenberg, B. & Voss, E.W., Jr. The immunochemistry of sandwich ELISAs--VI. Greater than 90% of monoclonal and 75% of polyclonal anti-fluorescyl capture antibodies (CAbs) are denatured by passive adsorption. Molecular immunology 30, 1165-1175 (1993).
Miller, J.C., Zhou, H., Kwekel, J., Cavallo, R., Burke, J., Butler, E.B., Teh, B.S. & Haab, B.B. Antibody microarray profiling of human prostate cancer sera: antibody screening and identification of potential biomarkers. Proteomics 3, 56-63 (2003).
Rimini, R., Schwenk, J.M., Sundberg, M., Sjöberg, R., Klevebring, D., Gry, M., Uhlén, M. & Nilsson, P. Validation of serum protein profiles by a dual antibody array approach. Journal of Proteomics 73, 252-266 (2009).
Wingren, C. & Borrebaeck, C.A. Antibody microarray analysis of directly labelled complex proteomes. Curr Opin Biotechnol 19, 55-61 (2008).
Wilson, J.J., Burgess, R., Mao, Y.-Q., Luo, S., Tang, H., Jones, V.S., Weisheng, B., Huang, R.-Y., Chen, X. & Huang, R.-P. in Advances in Clinical Chemistry (Elsevier.
Sonntag, J., Schluter, K., Bernhardt, S. & Korf, U. Subtyping of breast cancer using reverse phase protein arrays. Expert Rev Proteomics 11, 757-770 (2014).
Janzi, M., Kull, I., Sjoberg, R., Wan, J., Melen, E., Bayat, N., Ostblom, E., Pan-Hammarstrom, Q., Nilsson, P. & Hammarstrom, L. Selective IgA deficiency in early life: association to infections and allergic diseases during childhood. Clin Immunol 133, 78-85 (2009).
Gyorgy, A.B., Walker, J., Wingo, D., Eidelman, O., Pollard, H.B., Molnar, A. & Agoston, D.V. Reverse phase protein microarray technology in traumatic brain injury. J Neurosci Methods 192, 96-101 (2010).
Akbani, R., Becker, K.F., Carragher, N., Goldstein, T., de Koning, L., Korf, U., Liotta, L., Mills, G.B., Nishizuka, S.S., Pawlak, M., Petricoin, E.F., 3rd, Pollard, H.B., Serrels, B. & Zhu, J. Realizing the promise of reverse phase protein arrays for clinical, translational, and basic research: a workshop report: the RPPA (Reverse Phase Protein Array) society. Mol Cell Proteomics 13, 1625-1643 (2014).
Taylor, S., Smith, S., Windle, B. & Guiseppi-Elie, A. Impact of surface chemistry and blocking strategies on DNA microarrays. Nucleic Acids Res 31, e87 (2003).
VanMeter, A., Signore, M., Pierobon, M., Espina, V., Liotta, L.A. & Petricoin, E.F., 3rd Reverse-phase protein microarrays: application to biomarker discovery and translational medicine. Expert Rev Mol Diagn 7, 625-633 (2007).
96. Calvert, V., Tang, Y., Boveia, V., Wulfkuhle, J., Schutz-Geschwender, A., Michael Olive, D., Liotta, L. & Petricoin, E., III Development of multiplexed protein profiling and detection using near infrared detection of reverse-phase protein microarrays. Clin Proteom 1, 81-89 (2004).
97. Barbulovic-Nad, I., Lucente, M., Sun, Y., Zhang, M., Wheeler, A.R. & Bussmann, M. Biomicroarray fabrication techniques--a review. Critical reviews in biotechnology 26, 237-259 (2006).
98. Weibel, C. “The Spotting Accelerator™”, Customizable Head Assembly for Advanced Microarraying. Journal of the Association for Laboratory Automation 7, 89-94 (2002).
99. McQuain, M.K., Seale, K., Peek, J., Levy, S. & Haselton, F.R. Effects of relative humidity and buffer additives on the contact printing of microarrays by quill pins. Analytical Biochemistry 320, 281-291 (2003).
100. Austin, J. & Holway, A.H. Contact printing of protein microarrays. Methods Mol Biol 785, 379-394 (2011).
101. Fang, Y. Air stability of supported lipid membrane spots. Chemical Physics Letters 512, 258-262 (2011).
102. Nakanishi, K., Sakiyama, T. & Imamura, K. On the adsorption of proteins on solid surfaces, a common but very complicated phenomenon. Journal of bioscience and bioengineering 91, 233-244 (2001).
103. Romanov, V., Davidoff, S.N., Miles, A.R., Grainger, D.W., Gale, B.K. & Brooks, B.D. A critical comparison of protein microarray fabrication technologies. Analyst 139, 1303-1326 (2014).
104. McWilliam, I., Chong Kwan, M. & Hall, D. Inkjet printing for the production of protein microarrays. Methods Mol Biol 785, 345-361 (2011).
105. Hartmann, M., Sjodahl, J., Stjernstrom, M., Redeby, J., Joos, T. & Roeraade, J. Non-contact protein microarray fabrication using a procedure based on liquid bridge formation. Anal Bioanal Chem 393, 591-598 (2009).
106. Fodor, S.P., Read, J.L., Pirrung, M.C., Stryer, L., Lu, A.T. & Solas, D. Light-directed, spatially addressable parallel chemical synthesis. Science 251, 767-773 (1991).
107. Buus, S., Rockberg, J., Forsstrom, B., Nilsson, P., Uhlen, M. & Schafer-Nielsen, C. High-resolution mapping of linear antibody epitopes using ultrahigh-density peptide microarrays. Mol Cell Proteomics 11, 1790-1800 (2012).
108. He, M. & Taussig, M.J. Single step generation of protein arrays from DNA by cell-free expression and in situ immobilisation (PISA method). Nucleic Acids Res 29, E73-73 (2001).
109. Schmidt, R., Cook, E.A., Kastelic, D., Taussig, M.J. & Stoevesandt, O. Optimised 'on demand' protein arraying from DNA by cell free expression with the 'DNA to Protein Array' (DAPA) technology. J Proteomics 88, 141-148 (2013).
110. He, M., Stoevesandt, O. & Taussig, M.J. In situ synthesis of protein arrays. Curr Opin Biotechnol 19, 4-9 (2008).
111. Ramachandran, N., Hainsworth, E., Bhullar, B., Eisenstein, S., Rosen, B., Lau, A.Y., Walter, J.C. & LaBaer, J. Self-assembling protein microarrays. Science 305, 86-90 (2004).
112. Ramachandran, N., Raphael, J.V., Hainsworth, E., Demirkan, G., Fuentes, M.G., Rolfs, A., Hu, Y. & LaBaer, J. Next-generation high-density self-assembling functional protein arrays. Nat Methods 5, 535-538 (2008).
113. Clark, D.C., Mackie, A.R., Wilde, P.J. & Wilson, D.R. Differences in the Structure and Dynamics of the Adsorbed Layers in Protein-Stabilized Model Foams and Emulsions. Faraday Discuss 98, 253-262 (1994).
114. Deegan, R.D., Bakajin, O., Dupont, T.F., Huber, G., Nagel, S.R. & Witten, T.A. Capillary flow as the cause of ring stains from dried liquid drops. Nature 389, 827-829 (1997).
115. Deng, Y., Zhu, X.Y., Kienlen, T. & Guo, A. Transport at the air/water interface is the reason for rings in protein microarrays. J Am Chem Soc 128, 2768-2769 (2006).
116. Olle, E.W., Messamore, J., Deogracias, M.P., McClintock, S.D., Anderson, T.D. & Johnson, K.J. Comparison of antibody array substrates and the use of glycerol to normalize spot morphology. Experimental and Molecular Pathology 79, 206-209 (2005).
117. Mackie, A.R., Gunning, A.P., Wilde, P.J. & Morris, V.J. Orogenic displacement of protein from the air/water interface by competitive adsorption. J Colloid Interf Sci 210, 157-166 (1999).
118. Angenendt, P., Glokler, J., Sobek, J., Lehrach, H. & Cahill, D.J. Next generation of protein microarray support materials: evaluation for protein and antibody microarray applications. J Chromatogr A 1009, 97-104 (2003).
119. Danczyk, R., Krieder, B., North, A., Webster, T., HogenEsch, H. & Rundell, A. Comparison of antibody functionality using different immobilization methods. Biotechnol Bioeng 84, 215-223 (2003).
120. Sauer, U. Impact of substrates for probe immobilization. Methods Mol Biol 785, 363-378 (2011).
121. Derwinska, K., Gheber, L.A. & Preininger, C. A comparative analysis of polyurethane hydrogel for immobilization of IgG on chips. Anal Chim Acta 592, 132-138 (2007).
122. Harbers, G.M., Emoto, K., Greef, C., Metzger, S.W., Woodward, H.N., Mascali, J.J., Grainger, D.W. & Lochhead, M.J. A functionalized poly(ethylene glycol)-based bioassay surface chemistry that facilitates bio-immobilization and inhibits non-specific protein, bacterial, and mammalian cell adhesion. Chemistry of materials : a publication of the American Chemical Society 19, 4405-4414 (2007).
123. Sobek, J., Aquino, C., Weigel, W. & Schlapbach, R. Drop drying on surfaces determines chemical reactivity - the specific case of immobilization of oligonucleotides on microarrays. BMC Biophys 6, 8 (2013).
124. Ray, S., Mehta, G. & Srivastava, S. Label-free detection techniques for protein microarrays: prospects, merits and challenges. Proteomics 10, 731-748 (2010).
125. Forster, T., Costa, Y., Roy, D., Cooke, H.J. & Maratou, K. Triple-target microarray experiments: a novel experimental strategy. BMC Genomics 5, 13 (2004).
126. Fare, T.L., Coffey, E.M., Dai, H., He, Y.D., Kessler, D.A., Kilian, K.A., Koch, J.E., LeProust, E., Marton, M.J., Meyer, M.R., Stoughton, R.B., Tokiwa, G.Y. & Wang, Y. Effects of atmospheric ozone on microarray data quality. Anal Chem 75, 4672-4675 (2003).
127. Cox, W.G., Beaudet, M.P., Agnew, J.Y. & Ruth, J.L. Possible sources of dye-related signal correlation bias in two-color DNA microarray assays. Anal Biochem 331, 243-254 (2004).
128. Staal, Y.C., van Herwijnen, M.H., van Schooten, F.J. & van Delft, J.H. Application of four dyes in gene expression analyses by microarrays. BMC Genomics 6, 101 (2005).
129. Berlier, J.E., Rothe, A., Buller, G., Bradford, J., Gray, D.R., Filanoski, B.J., Telford, W.G., Yue, S., Liu, J., Cheung, C.Y., Chang, W., Hirsch, J.D., Beechem, J.M., Haugland, R.P. & Haugland, R.P. Quantitative comparison of long-wavelength Alexa Fluor dyes to Cy dyes: fluorescence of the dyes and their bioconjugates. J Histochem Cytochem 51, 1699-1712 (2003).
130. Chan, W.C., Maxwell, D.J., Gao, X., Bailey, R.E., Han, M. & Nie, S. Luminescent quantum dots for multiplexed biological detection and imaging. Curr Opin Biotechnol 13, 40-46 (2002).
131. Espina, V., Woodhouse, E.C., Wulfkuhle, J., Asmussen, H.D., Petricoin, E.F., 3rd & Liotta, L.A. Protein microarray detection strategies: focus on direct detection technologies. J Immunol Methods 290, 121-133 (2004).
132. Schweitzer, B., Wiltshire, S., Lambert, J., O'Malley, S., Kukanskis, K., Zhu, Z., Kingsmore, S.F., Lizardi, P.M. & Ward, D.C. Immunoassays with rolling circle DNA amplification: a versatile platform for ultrasensitive antigen detection. Proc Natl Acad Sci U S A 97, 10113-10119 (2000).
133. Schweitzer, B., Roberts, S., Grimwade, B., Shao, W., Wang, M., Fu, Q., Shu, Q., Laroche, I., Zhou, Z., Tchernev, V.T., Christiansen, J., Velleca, M. & Kingsmore, S.F. Multiplexed protein profiling on microarrays by rolling-circle amplification. Nat Biotechnol 20, 359-365 (2002).
134. Sang, S., Wang, Y., Feng, Q., Wei, Y., Ji, J. & Zhang, W. Progress of new label-free techniques for biosensors: a review. Critical reviews in biotechnology, 1-17 (2015).
135. Scarano, S., Mascini, M., Turner, A.P. & Minunni, M. Surface plasmon resonance imaging for affinity-based biosensors. Biosens Bioelectron 25, 957-966 (2010).
136. Scherl, A. Clinical protein mass spectrometry. Methods.
137. Boozer, C., Kim, G., Cong, S., Guan, H. & Londergan, T. Looking towards label-free biomolecular interaction analysis in a high-throughput format: a review of new surface plasmon resonance technologies. Current Opinion in Biotechnology 17, 400-405 (2006).
138. Frederix, F., Bonroy, K., Reekmans, G., Laureyn, W., Campitelli, A., Abramov, M.A., Dehaen, W. & Maes, G. Reduced nonspecific adsorption on covalently immobilized protein surfaces using poly(ethylene oxide) containing blocking agents. Journal of biochemical and biophysical methods 58, 67-74 (2004).
139. MacBeath, G. & Schreiber, S.L. Printing proteins as microarrays for high-throughput function determination. Science 289, 1760-1763 (2000).
140. Walter, J.-G., Stahl, F., Reck, M., Praulich, I., Nataf, Y., Hollas, M., Pflanz, K., Melzner, D., Shoham, Y. & Scheper, T. Protein microarrays: Reduced autofluorescence and improved LOD. Engineering in Life Sciences 10, 103-108 (2010).
141. Sill, M., Schroder, C., Hoheisel, J.D., Benner, A. & Zucknick, M. Assessment and optimisation of normalisation methods for dual-colour antibody microarrays. BMC Bioinformatics 11, 556 (2010).
142. Eckel-Passow, J.E., Hoering, A., Therneau, T.M. & Ghobrial, I. Experimental design and analysis of antibody microarrays: applying methods from cDNA arrays. Cancer Res 65, 2985-2989 (2005).
143. Olle, E.W., Sreekumar, A., Warner, R.L., McClintock, S.D., Chinnaiyan, A.M., Bleavins, M.R., Anderson, T.D. & Johnson, K.J. Development of an internally controlled antibody microarray. Mol Cell Proteomics 4, 1664-1672 (2005).
144. Zhang, L., Wei, Q., Mao, L., Liu, W., Mills, G.B. & Coombes, K. Serial dilution curve: a new method for analysis of reverse phase protein array data. Bioinformatics 25, 650-654 (2009).
145. Anderson, T., Wulfkuhle, J., Liotta, L., Winslow, R.L. & Petricoin, E., 3rd Improved reproducibility of reverse-phase protein microarrays using array microenvironment normalization. Proteomics 9, 5562-5566 (2009).
146. Kaushik, P., Molinelli, E.J., Miller, M.L., Wang, W., Korkut, A., Liu, W., Ju, Z., Lu, Y., Mills, G. & Sander, C. Spatial normalization of reverse phase protein array data. PLoS One 9, e97213 (2014).
147. Structural Genomics, C., China Structural Genomics, C., Northeast Structural Genomics, C., Graslund, S., Nordlund, P., Weigelt, J., Hallberg, B.M., Bray, J., Gileadi, O., Knapp, S., Oppermann, U., Arrowsmith, C., Hui, R., Ming, J., dhe-Paganon, S., Park, H.W., Savchenko, A., Yee, A., Edwards, A., Vincentelli, R., Cambillau, C., Kim, R., Kim, S.H., Rao, Z., Shi, Y., Terwilliger, T.C., Kim, C.Y., Hung, L.W., Waldo, G.S., Peleg, Y., Albeck, S., Unger, T., Dym, O., Prilusky, J., Sussman, J.L., Stevens, R.C., Lesley, S.A., Wilson, I.A., Joachimiak, A., Collart, F., Dementieva, I., Donnelly, M.I., Eschenfeldt, W.H., Kim, Y., Stols, L., Wu, R., Zhou, M., Burley, S.K., Emtage, J.S., Sauder, J.M., Thompson, D., Bain, K., Luz, J., Gheyi, T., Zhang, F., Atwell, S., Almo, S.C., Bonanno, J.B., Fiser, A., Swaminathan, S., Studier, F.W., Chance, M.R., Sali, A., Acton, T.B., Xiao, R., Zhao, L., Ma, L.C., Hunt, J.F., Tong, L., Cunningham, K., Inouye, M., Anderson, S., Janjua, H., Shastry, R., Ho, C.K., Wang, D., Wang, H., Jiang, M., Montelione, G.T., Stuart, D.I., Owens, R.J., Daenke, S., Schutz, A., Heinemann, U., Yokoyama, S., Bussow, K. & Gunsalus, K.C. Protein production and purification. Nat Methods 5, 135-146 (2008).