
Primer design and the stringency of the BLAST E-value


When searching for mispriming, I have been told that an E-value higher than 0.01 is fine and that no mispriming would occur. Nonetheless, I did some searching, and the cutoff threshold for the E-value seems to depend on the "amount of information" in the database. In fact, some papers have reported values lower than 0.07 as already significant.

What criteria do you use to address this issue, and when running a routine BLAST for primer design purposes, how do I know when to decrease or increase the stringency?


I have never used BLAST for primer design, only to find out where a sequence came from.

From the NCBI FAQ: http://www.ncbi.nlm.nih.gov/blast/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=FAQ

"The lower the E-value, or the closer it is to zero, the more "significant" the match is. However, keep in mind that virtually identical short alignments have relatively high E values. This is because the calculation of the E value takes into account the length of the query sequence. These high E values make sense because shorter sequences have a higher probability of occurring in the database purely by chance. For more details please see the calculations in the BLAST Course."

What the papers say is true: the size of your database matters when deciding how significant a match is. I can imagine that for primer design you would want to be more careful with E-values, because primers are short and "shorter sequences have a higher probability of occurring in the database purely by chance".

I hope this helps you decide on your threshold; otherwise, please provide more information.


This NCBI page offers a primer design tool that includes a specificity check via BLAST. However, it bases stringency on the degree of mismatch rather than on the E-value.



Primer-BLAST was developed to generate primers specific to an input PCR template, using Primer3. It can also check the specificity of primers supplied by the user.

The "Search for short, nearly exact matches" nucleotide and protein pages no longer exist. Instead, the nucleotide and protein BLAST programs automatically check for short queries and adjust the search parameters accordingly. This adjustment occurs when the nucleotide or amino acid query is 30 residues or less in length. The translated BLAST programs and searches on the genome BLAST pages do not have this automatic adjustment feature.

Q: Default database for nucleotide-nucleotide searches

Q: Saving search parameters

Q: How to limit a search to an organism or taxonomic group, or exclude such groups

To search only sequences from an organism or taxonomic group, use the "Organism" text field. On the nucleotide BLAST pages, first click the radio button for "Others (nr etc.)". The "Organism" text box has an autocomplete feature. Start typing the common name of an organism (rat, bacteria, etc.), a genus or species (elegans, danio, etc.), or an NCBI taxonomy ID, then select a name from the list.

A taxonomic group can also be excluded using the "Exclude" checkbox to the right of the "Organism" box.

Additional taxonomic groups can be included or excluded using the "+" box to the right of the "Organism" text box.

You can still use Entrez Query terms as before. Put them in the "Entrez Query" field below the "Organism" field, for example, rattus norvegicus[organism] or simply rat[orgn]. See also "How to limit a search to a subset of database sequences".

You can look up taxa in the Taxonomy Browser.

Q: How to exclude models (XM/XP accessions) and uncultured environmental sequences?

Q: How to limit a search to a subset of database sequences?

Q: How can I search a batch of sequences with BLAST?

    1.) BLAST standalone executables. These are command-line programs that run BLAST searches against local, downloaded copies of the NCBI BLAST databases or against custom databases formatted for BLAST. The programs can handle a single large file containing multiple FASTA query sequences, or you can write a script to submit multiple files. Executables are available for a variety of platforms, including LINUX, Windows and Mac OSX.

Q: How to use BLAST to align two sequences without searching a database.

Q: What is the Expect (E) value?

The Expect value (E) is a parameter that describes the number of hits one can "expect" to see by chance when searching a database of a particular size. It decreases exponentially as the Score (S) of the match increases. Essentially, the E value describes the random background noise. For example, an E value of 1 assigned to a hit can be interpreted as meaning that, in a database of the current size, one might expect to see 1 match with a similar score simply by chance.

The lower the E-value, or the closer it is to zero, the more "significant" the match is. However, keep in mind that virtually identical short alignments have relatively high E values. This is because the calculation of the E value takes into account the length of the query sequence. These high E values make sense because shorter sequences have a higher probability of occurring in the database purely by chance. For more details please see the calculations in the BLAST Course.
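The dependence of E on query length and database size can be made concrete with the bit-score form of the Karlin-Altschul relation, E = m * n * 2^(-S'), where m is the query length, n the database length and S' the bit score. The following is a minimal sketch only: the function name, the bit score of about 40 for a perfect 20-nt match and the database sizes are illustrative assumptions, and BLAST itself uses length-corrected (effective) values of m and n, so real E-values will differ somewhat.

```python
def expect_value(bit_score: float, query_len: int, db_len: float) -> float:
    """Approximate E = m * n * 2**(-S') using raw (uncorrected) lengths."""
    return query_len * db_len * 2.0 ** (-bit_score)


if __name__ == "__main__":
    # A 20-nt primer with a perfect match (bit score assumed to be about 40).
    for db_len in (1e8, 1e10, 1e11):
        e = expect_value(40.0, 20, db_len)
        print(f"database of {db_len:.0e} nt -> E ~ {e:.3g}")
```

With these assumed numbers, the same perfect 20-nt match has an E-value well below 0.01 in a database of 10^8 nt and above 1 in a database of 10^11 nt, which is why a single fixed cutoff is hard to apply to primer-length queries.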

The Expect value can also be used as a convenient way to create a significance threshold for reporting results. You can change the Expect value threshold on most BLAST search pages. When the Expect value is increased from the default value of 10, a larger list with more low-scoring hits can be reported.

What is "low-complexity" sequence?

Regions with low-complexity sequence have an unusual composition that can create problems in sequence similarity searching. For amino acid queries this compositional bias is determined by the SEG program (Wootton and Federhen, 1996). For nucleotide queries it is determined by the DustMasker program (Morgulis, et al., 2006).

Low-complexity sequence can often be recognized by visual inspection. For example, the protein sequence PPCDPPPPPKDKKKKDDGPP has low complexity, and so does the nucleotide sequence AAATAAAAAAAATAAAAAAT. Filters are used to remove low-complexity sequence because it can cause artifactual hits.

In BLAST searches performed without a filter, high-scoring hits may be reported only because of the presence of a low-complexity region. Most often, it is inappropriate to consider this type of match as the result of shared homology. Rather, it is as if the low-complexity region is "sticky" and is pulling out many sequences that are not truly related.
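As a rough illustration of what "low complexity" means numerically, the sketch below computes the Shannon entropy of the residue composition. This is a stand-in chosen for simplicity, not the SEG or DustMasker algorithm, and the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy of the residue composition, in bits per residue."""
    counts = Counter(seq.upper())
    total = len(seq)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    for s in ("AAATAAAAAAAATAAAAAAT",   # low-complexity example from the text
              "ACGTACGTACGTACGTACGT",   # evenly mixed composition
              "PPCDPPPPPKDKKKKDDGPP"):  # low-complexity protein example
        print(s, round(shannon_entropy(s), 2))
```

A nucleotide sequence with all four bases in equal proportion reaches the maximum of 2 bits, while the A-rich example scores far lower. Composition alone does not capture repeat structure, so a real masker looks at more than this single number.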

How to filter (organism-specific) interspersed repeats?


What are primers?

Primers are simple but essential components for DNA synthesis, both in our bodies and in scientific experiments. Primers can also be called oligonucleotides, and they are literally short stretches of single-stranded nucleotides, roughly 5 – 22 base pairs in length. The key property of primers is that they are complementary to the DNA template strand, serving to "prime" the strand so that DNA polymerase can bind and begin DNA synthesis.

What types of primers exist? RNA and DNA primers

Living organisms use only RNA primers, whereas primers used in the laboratory are usually DNA primers. Scientists use DNA primers instead of RNA primers for several reasons: DNA primers are much more stable, they are easier to store, and they need fewer enzymes to initiate synthesis (see Figure 1).

Property | DNA primers | RNA primers
Use | In vitro: PCR amplification, DNA sequencing, cloning, etc. | In vivo: DNA replication
Reaction | Amplification is temperature dependent and requires fewer proteins | Replication is an enzyme-dependent catalytic reaction that requires several proteins
Length | 18 – 24 base pairs | 10 – 20 base pairs
Creation | Chemically synthesized by scientists | Primase (a type of RNA polymerase)
Longevity | Longer lived, more stable | Short lived, more reactive

Binding of a DNA or RNA primer to the template strand allows DNA polymerase, the enzyme responsible for DNA synthesis, to begin adding nucleotides to the reactive 3'-hydroxyl end (called the "3 prime end") of the nucleic acid primer, extending it and replicating the parent strand.
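Complementarity between a primer and its template can be checked in a few lines. The sketch below assumes both sequences are written 5' to 3' and contain only A, C, G and T; the function names and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def binding_site(primer: str, template: str) -> int:
    """0-based index on the template where the primer anneals, or -1.

    The primer pairs antiparallel with the template, so the template
    segment it binds reads as the primer's reverse complement.
    """
    return template.upper().find(reverse_complement(primer))


if __name__ == "__main__":
    template = "ATGCGTACGTTAGC"
    primer = "GCTAACG"
    print(binding_site(primer, template))
```

Only exact matches are found here. Real primers can also anneal to sites with mismatches, which is the problem that specificity checking tries to address.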


Results

A schematic of the ThermoAlign pipeline is shown in Fig. 1. The following sections present the results for each module of the tool. To demonstrate the pipeline and highlight the features of ThermoAlign, a 24 kb target region of the maize genome (B73 RefGen_v3 Chr3:33490673..33514673) was used. Of this region, 66% is annotated as repeat-masked in the genome assembly. Examining repetitiveness as it relates to non-target sequence and primer binding, 72% of the primers designed for this region are predicted to produce between 1 and 215 off-target priming events for a given primer (Fig. 2). The same region, along with other segments of the genome, was used to test the amplification specificity of primers designed by ThermoAlign.

Figure 1. A single run parameters file is used by all components of the pipeline. Colored boxes represent the four core modules of ThermoAlign, listed in order of operation: (1) target region selection, (2) unique oligonucleotide design, (3) primer specificity evaluation and (4) primer pair selection. Dashed boxes represent subroutines within each of these modules, and arrows depict their order of operation. The remaining elements are the database (reference genome sequence), external files (variant call format [.vcf] files and the run parameters file) and functions (the nearest-neighbor Tm model and the homodimer, heterodimer and hairpin interaction functions in Primer3). Connector lines for these remaining components depict dependencies on the connected components (the filled dot indicates the source from which information or a function is drawn). Inputs required by ThermoAlign are marked with an asterisk.

Figure 2. The figure is based on an analysis of every 25 bp sequence of the plus strand (26 bp sliding window). For all sub-figures, red lines indicate the number of off-target thermoalignments with a Tm within 10 °C of the corresponding on-target Tm. Yellow lines (orange where they overlap with red) indicate the number of thermoalignments between a given primer and off-target sites with ≥70% identity (pid). Blue lines indicate percent GC content. The search for off-target sites was based on the BLASTn settings used for specificity evaluation in this study (see Methods), with a maximum of 20 potential sites per pseudomolecule, or 260 possible sites in total. (a) Aggregate distribution of repeat counts and percent GC content. (b) Genomic distribution of repeat content and percent GC. The pseudomolecule coordinate of the 5'-nucleotide of each 25-mer was used to position the plotted data. Black horizontal lines on the x-axis indicate the two genes in this region [left: GRMZM2G031364; right: GRMZM2G031239]. Among the 25-mers in the region, ≈73% are predicted to have a mispriming Tm within 10 °C of the primer Tm. (c) CIRCOS plot for the single primer in the region with the greatest number of mispriming sites across the genome (n = 215). The red lines of the CIRCOS plot link predicted mispriming sites on the pseudomolecules for chromosomes 1 to 10, mitochondria (Mt), plastid (Pt) and unmapped sequences.
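The per-25-mer GC content plotted in the figure can be reproduced in outline with a sliding window. This is a minimal sketch, assuming a step of 1 bp and plain A/C/G/T input; the function names are hypothetical and the sketch is not ThermoAlign code.

```python
def gc_percent(seq: str) -> float:
    """Percent of bases in seq that are G or C."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def windows(seq: str, size: int = 25, step: int = 1):
    """Yield (0-based start of the 5' nucleotide, window sequence)."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


if __name__ == "__main__":
    region = "ATGCGCGCTTAACGGATCCATGCATGCCGGTTAAGC"
    for start, kmer in windows(region):
        print(start, kmer, round(gc_percent(kmer), 1))
```

Each window is keyed by the coordinate of its 5' nucleotide, matching how the caption says the plotted data were positioned.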

Target Region Selection (TRS)

ThermoAlign produces an output file containing summary information from the run (e.g., Supplementary File S1). The output for the 24 kb target region showed that it contained no gaps in the reference sequence assembly, 1,073 SNPs, 93 indels and 46% GC content.

Unique Oligo Design (UOD)

The UOD algorithm is designed to identify every individual primer (not primer pairs) that is deemed favorable for PCR and has no identical match elsewhere in the genome. For the 24 kb target region, among 184,145 possible primers, 82,520 did not occur at sites containing maize HapMap3 polymorphisms (38). Applying the full set of remaining UOD filters (see Supplementary File S2 for settings) resulted in the selection of 877 candidate primers.

The classification of the 82,520 primers into UOD filtering categories was examined to see which features had the greatest impact on primer removal. This was divided into two parts, starting with filters for primer sequence features and ending with filters for primer interactions (Supplementary Fig. S1). A total of 75,073 primers were filtered out on sequence features. Considering primers associated with only a single sequence category, the A/T-end filter removed the largest number of primers (n = 9,217), which accounted for ≈50% of the total set of primers unique to a single feature (Supplementary Fig. S1a). The A/T-end feature is a useful heuristic for eliminating primers with a high potential for inefficient priming (39). Optionally, the A/T-end filter or other filters can be excluded or re-parameterized to achieve a higher discovery rate of candidate primers, but this comes at the cost of increased computational time for primer specificity evaluation (PSE; next section). For example, excluding the A/T-end filter from UOD yielded 1,161 additional candidate primers (relative to the 877 identified with the A/T-end filter applied), but required approximately four times as many seconds of run time for PSE.
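The trade-off described above, more candidates when a filter is switched off, can be sketched as follows. The exact rule behind the A/T-end filter is not given in this text, so the sketch assumes it rejects primers whose 3'-terminal base is A or T; the function names and the primer list are hypothetical.

```python
def ends_with_at(primer: str) -> bool:
    """True if the 3'-terminal base is A or T (assumed meaning of the filter)."""
    return primer.upper()[-1] in "AT"


def apply_filters(primers, use_at_end_filter=True):
    """Return the primers that survive the (optional) A/T-end filter."""
    kept = []
    for p in primers:
        if use_at_end_filter and ends_with_at(p):
            continue
        kept.append(p)
    return kept


if __name__ == "__main__":
    primers = ["ACGTTGCAGC", "TTGACCGTAA", "GGCATCGATG", "CCATGGTTCT"]
    print(len(apply_filters(primers)), "kept with the filter")
    print(len(apply_filters(primers, use_at_end_filter=False)), "kept without it")
```

Every primer that a sequence filter lets through must still be searched against the genome, which is why relaxing a cheap filter raises the cost of the later specificity evaluation.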

The primer interaction filters, applied to the 7,447 primers remaining after filtering on sequence features, comprise an exact match at an off-target site in the genome, homodimer Tm, heterodimer Tm and hairpin Tm (40) (Supplementary Fig. S1b). These removed a further 6,570 primers, leaving 433 forward and 444 reverse primers, 136 of which were at the same position on the two strands.

Primer Specificity Evaluation (PSE)

An important aspect of ThermoAlign is the algorithmic and quantitative approach used to characterize off-target hybridization sites. As part of the algorithm for identifying mispriming potential, the BLASTn alignment for each off-target match is revised into a thermoalignment (a full-length, ungapped primer-template alignment), which allows a proper and accurate Tm to be computed for the primer (Fig. 3). Local BLASTn alignments with ≥70% sequence identity (mostly truncated local alignments) had a mean Tm that was 7 °C higher than that of their thermoalignments (Fig. 3b). However, for 10.8% (n = 18,834) of the BLASTn alignments the Tm was lower than that of their thermoalignment (Fig. 3b). The difference in Tm between BLASTn alignments and their corresponding thermoalignments ranged from -14 °C to 272 °C. Considering the relationship between the number of mismatches and Tm, Fig. 3c,d showed that the number of mismatches, although correlated with thermoalignment Tm, is not a suitable proxy for mispriming potential. Even with several mismatches, the Tm for binding at off-target sites can be at temperatures typical for PCR (e.g., >60 °C; Fig. 3c). In addition, the off-target Tm may not always be far enough from the on-target Tm for specific priming (Fig. 3). For the data in Fig. 3d, ≈80% of the thermoalignments had an on-target Tm more than 10 °C above the off-target Tm.

Figure 3. (a.1) Examples of full-length primer sequences. (a.2) Top-ranked BLASTn high-scoring segment pair (HSP) alignments for two off-target sequences (bottom line) that are to be reprocessed. (a.3) Thermoalignment by end-filling the original BLASTn HSP alignment (ungapped BLASTn) or by removing gaps and end-filling (gapped BLASTn). (b) For the 877 candidate primers output by the UOD module for the 24 kb region described in the text, Tm was computed for each top-ranked BLASTn HSP alignment and for its corresponding thermoalignment. (c) Using the set of thermoalignments derived from ungapped BLASTn HSPs (n = 169,404 alignments), the plot shows the relationship between off-target Tm of the thermoalignments and the total number of mismatches. (d) Using the same dataset as in (c), the plot shows the difference between on-target Tm and off-target Tm of the thermoalignments relative to the total number of mismatches.
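The end-filling step for an ungapped hit can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the ThermoAlign implementation: the hit is taken to be on the same strand as the primer, coordinates are 0-based, the gapped case (gap removal before end-filling) is not handled, and the function name is hypothetical. The Tm calculation itself is left out.

```python
def end_fill(primer: str, subject: str, q_start: int, s_start: int):
    """Extend an ungapped local hit to a full-length primer-template alignment.

    q_start and s_start are the 0-based positions where the local hit begins
    on the primer and on the subject. Returns (subject_window, mismatches),
    or None if the full-length window would run off the subject.
    """
    offset = s_start - q_start
    if offset < 0 or offset + len(primer) > len(subject):
        return None
    window = subject[offset:offset + len(primer)]
    mismatches = sum(a != b for a, b in zip(primer, window))
    return window, mismatches


if __name__ == "__main__":
    primer = "ACGTACGTAC"
    subject = "TTTACGTTCGTACGG"
    # Suppose the local hit covers only primer[5:10] == subject[8:13].
    print(end_fill(primer, subject, q_start=5, s_start=8))
```

The truncated local hit in this example has no mismatches, while the full-length alignment has one. This is the kind of information that is lost when a Tm is computed from the local alignment alone.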

Primer Pair Selection (PPS)

From the 877 oligonucleotides expected to hybridize stably and specifically to their target within the genome, 2,818 primer pair combinations were found to be suitable for standard PCR. The parameter settings used for PPS (Supplementary File S2) include a requirement for a +10 °C difference between the Tm of the primer with the lower Tm in a given pair and the greatest off-target Tm for the two primers. Lowering this threshold can increase the primer discovery rate, but the lower limit at which off-target amplicons may arise in an actual PCR must be considered. When set to +6 °C, the number of primer pairs selected by the PPS module for the 24 kb region increased to 4,189. Adjusting this threshold together with the upper limit of the Tm range used for UOD can increase the discovery rate. Widening the Tm range by 5 °C (from 64-74 °C to 62-77 °C) while maintaining the +10 °C maximum mispriming Tm difference resulted in the identification of 4,103 primer pairs through the UOD → PSE → PPS pipeline.
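The Tm margin rule described above reduces to a single comparison. The sketch below is a hedged illustration with a hypothetical function name and made-up Tm values; it is not taken from the ThermoAlign source.

```python
def passes_tm_margin(on_target_tms, off_target_tms, min_margin=10.0):
    """Check the pair-level Tm margin.

    on_target_tms: (forward, reverse) on-target Tm values in deg C.
    off_target_tms: highest off-target Tm of each primer, same order.
    The pair passes when the lower on-target Tm exceeds the highest
    off-target Tm by at least min_margin.
    """
    return min(on_target_tms) - max(off_target_tms) >= min_margin


if __name__ == "__main__":
    pair_on = (66.0, 68.5)
    pair_off = (54.0, 57.0)
    print(passes_tm_margin(pair_on, pair_off))                  # margin is 9 deg C
    print(passes_tm_margin(pair_on, pair_off, min_margin=6.0))
```

A pair with a 9 °C margin is rejected at the +10 °C setting and accepted at +6 °C, which is how relaxing the threshold raises the number of selected pairs.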

With the above 877 primers, a directed graph approach was used to identify the minimum number of primer pairs (shortest path) that provides the maximum coverage of the target region. Adjusting the amplicon size range was a determining factor in the coverage attainable for the region considered here (Supplementary Table S2). Smaller amplicon size ranges resulted in relatively low coverage, and the largest size ranges (≥15 kb) also fell short of the maximum. Maximum coverage was achieved for amplicon sizes from 5 to 15 kb. However, recalling that the A/T-end filter caused the loss of over a thousand primers, excluding this filter increased the expected coverage from a maximum of 61.8% (with the filter) to 88.7% (without the filter).
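One way to picture the directed graph approach is to treat each candidate amplicon as a node, draw an edge to every amplicon that overlaps it and extends further, and run a breadth-first search for the path with the fewest nodes. The sketch below makes simplifying assumptions that are not from the source: it only handles the case where a complete tiling path exists (it does not maximize partial coverage), amplicons are (start, end) tuples, and the function name is hypothetical.

```python
from collections import deque


def fewest_amplicons(amplicons, region_start, region_end):
    """Return the shortest chain of overlapping amplicons spanning the region.

    amplicons: list of (start, end) tuples, one per candidate primer pair.
    An edge a -> b exists when b starts within a and ends beyond it.
    Breadth-first search over this directed graph finds a path with the
    fewest nodes; None is returned if no full tiling path exists.
    """
    sources = [a for a in amplicons if a[0] <= region_start < a[1]]
    queue = deque((a, [a]) for a in sources)
    seen = set(sources)
    while queue:
        current, path = queue.popleft()
        if current[1] >= region_end:
            return path
        for nxt in amplicons:
            if nxt not in seen and nxt[0] <= current[1] < nxt[1]:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None


if __name__ == "__main__":
    candidates = [(0, 5000), (4000, 9000), (4500, 12000),
                  (8000, 15000), (11000, 24000)]
    print(fewest_amplicons(candidates, 0, 24000))
```

Restricting the allowed amplicon sizes changes which nodes exist in the graph, so the size range directly limits which tiling paths can be found.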

Empirical evaluation of primer specificity

Primer pairs produced by ThermoAlign were tested under standardized conditions for standard PCR and long-range PCR (see Methods). For standard PCR, 46 primer pairs associated with seven genes located on six maize chromosomes were tested (Supplementary File S3). Using the directed graph analysis in PPS, these primer pairs were designed to tile from 1 kb upstream to 1 kb downstream of each gene. Thirty-eight of these primer pairs produced an amplicon, and a single specific amplicon of the expected size was observed for each of them [Fig. 4a shows results for 29 of the 46 primer pairs, two of which did not amplify (6:7,048,348 and 7:128,406,874)].

Figure 4. Products from two additional genes that were amplified by standard PCR but not by long-range PCR (as described in the text) are not shown. Labels indicate the chromosome number of the target site, the start position of the forward primer and the expected product size. Detailed information on each primer is available in Supplementary File S3. (a) Standard PCR products were quantified without post-PCR purification, and ≈7.5 ng was loaded into each well. For the two reactions with no product, a volume equal to the average loaded volume was used. Multiplex reactions composed of the primer pairs in each set for a given gene were loaded alongside the individual primer pairs belonging to that set. (b) Long-range PCR products from reactions without (-) and with (+) betaine. PCR products were quantified without post-PCR purification, and 29 ng was loaded into each well. For the three reactions with no product, the same volume as used for the corresponding betaine reaction was loaded. For the negative control, the maximum volume used among all reactions was loaded. The negative control consisted of master mix and the primer pair TA_1_25390617_27_F and TA_1_25395472_24_R (Supplementary File S3), with no DNA template. Lanes with background smearing were associated with reactions that required a larger volume of product to be loaded in order to reach the standard amount of product across lanes.

ThermoAlign integrates MultiPLX (41), customizing its input and output to obtain two-group multiplexes that are compatible with the amplification of overlapping tiling paths. For each of the seven target genes tested by standard PCR, under "normal" stringency settings MultiPLX identified multiplexes of no more than two primer pairs (combining up to five primer pairs was possible). The amplicons produced by multiplex PCR generally corresponded to those produced by each primer pair individually (one primer pair in one multiplex set failed in the multiplex reaction), and no other amplicons were observed (Fig. 4a).

For five of the seven genes mentioned above, tiling paths of 0.1-5.0 kb amplicons were designed for each gene (independently of the standard PCR primers; Supplementary File S3) and tested by long-range PCR. For each gene, two primer pairs tiling the full length of the gene were identified (one exception: with the settings used, no primer pairs spanning the P450 gene on chromosome 3 were found). As with standard PCR, not all ten primer pairs produced an amplicon, but seven produced a single prominent amplicon of the expected size (Fig. 4b). For long-range PCR amplicons that failed to amplify or had low yield, a larger portion of the reaction product was loaded onto the gel to normalize the products for comparison. This revealed some background smearing greater than that of the negative control, indicating that random off-target amplification occurred during long-range PCR (potentially due to mega-primer amplification (14)).

Given the dependence of primer design on the reference genome, and because some standard PCR and long-range PCR reactions failed to produce amplicons, we asked whether these failed reactions were due to inaccuracies in the sequence assembly. On the assumption that long-range PCR primer pairs producing a specific amplicon indicate an accurate assembly, the production of standard PCR amplicons nested within these long-range PCR amplicons was used to address this question.

Twenty-nine standard PCR primer pairs were designed for the five genes tested by long-range PCR such that they were nested within at least one of the expected long-range PCR amplicons. Some of the standard PCR amplicons were nested within overlapping sections of two long-range PCR amplicons where one of the long-range primer pairs produced a product and the other did not. Excluding these standard PCR primer pairs, one of 21 standard PCR primer pairs failed to produce an amplicon in regions where an amplicon was produced by long-range PCR. Conversely, all five standard PCR primer pairs produced an amplicon in regions where no amplicon was produced by long-range PCR. The association between successful and failed reactions for standard and long-range PCR was not significant (Fisher's exact test, p = 1.0), which failed to implicate assembly errors as the cause of the PCR failures.
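The reported p-value can be checked from the counts given above. The 2x2 table used here is reconstructed from the text (20 successes and 1 failure among the 21 standard PCR pairs in regions where long-range PCR succeeded; 5 successes and 0 failures where it failed), and the sketch implements the usual two-sided Fisher's exact test with the standard library only. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= observed * (1 + 1e-9))


if __name__ == "__main__":
    # Rows: long-range PCR amplicon produced / not produced.
    # Columns: standard PCR amplicon produced / not produced.
    print(fisher_exact_two_sided(20, 1, 5, 0))
```

With only one failure among the 26 nested primer pairs, every possible table is at most as likely as the observed one, so the two-sided p-value comes out at 1.0, in line with the value reported in the text.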

Considering the potential for the sequence composition of the primers or of the amplification target to affect success (14), betaine was added to the reactions, which resulted in all 10 long-range PCR primer pairs producing a specific product of the expected size (Fig. 4b). Subsequent testing of the standard PCR primer pairs with betaine recovered a single specific amplicon for the two nested pairs that had failed in the absence of betaine, though amplification was poor (data not shown). Additional PCR optimization may improve the amplification efficiency of these primer pairs. For long-range PCR, the reactions recovered by the addition of betaine had higher mean GC content, by 3.2 percentage points for the primers and by 7.8 percentage points for the expected amplicons (based on the B73 reference genome sequence). Similarly, the standard PCR reactions recovered with betaine (considering all 46 primer pairs) had higher mean GC content for the primers (3.7 percentage points) and for the expected amplicons (19.7 percentage points).

To confirm that the amplicons corresponded to the target loci, nine of the ten long-range PCR products in Fig. 4b were pooled for single molecule, real-time sequencing. A primer-based clustering and sequence analysis approach produced nine consensus sequences that fully matched the expected sequences (Table 1; Supplementary File S4).


Methods and implementation

The workflow of oli2go is shown in Figure 1. The following subsections describe the main features of each step in detail.

Figure 1. Overview of the oli2go software. (A) Depicts the workflow, starting with the input of n DNA sequences, followed by the multiplex design, which is performed independently for each input sequence. Afterwards, a primer dimer check is performed using all primers produced in the multiplex design. The main output contains primers and probes for each input sequence in FASTA format. (B) Gives a more detailed view of the multiplex probe and primer design steps, including k-mer selection, Tm calculations for each input sequence, hairpin checks, probe and primer specificity checks, as well as probe and primer pairing. (C) Visualizes the primer dimer check, in which all primers targeting all input sequences, resulting from the preceding multiplex design, are checked for primer dimer formation.


Input

The home page of the web-based oli2go tool is used to upload input sequences and to define the design parameters. Sequences must be provided in FASTA format, either by upload or via a dedicated input box. The input must contain at least two sequences, since oli2go is designed to handle multiple sequences for multiplex reactions. Sequences containing ambiguous nucleotides are supported but should be used with caution, because every variable position in a sequence increases the number of computational steps. Since specificity checks are performed for every possible variant, the run time increases as a consequence. The specified input parameters are needed for primer and probe design and for the dimerization checks. Depending on the use case, the default parameters should be adjusted sensibly. Several articles describe in detail how to choose optimized parameters for primer and probe design (3, 4, 17, 18). In addition, oli2go supports an option to generate two-part hybridization probes for use in ligation-based experiments.
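The growth in work caused by ambiguous nucleotides can be seen by expanding an IUPAC-coded sequence into every unambiguous variant. How oli2go handles this internally is not described here, so the following is only a sketch of the combinatorics, with a hypothetical function name.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand(seq: str):
    """Return every unambiguous sequence encoded by an IUPAC string."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in seq.upper()))]


if __name__ == "__main__":
    for oligo in ("ACGT", "ACRT", "ANRY"):
        print(oligo, len(expand(oligo)), "variant(s)")
```

The number of variants is the product of the choices at each position, so a handful of ambiguous positions in one oligonucleotide can multiply the number of specificity checks many times over.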

File preparation

Input sequences are first aligned using the standalone version of the National Center for Biotechnology Information's (NCBI) Basic Local Alignment Search Tool (BLAST), version 2.7.0+, against a comprehensive set of databases (Table 1). These databases are a collection of >100 million sequences downloaded from the NCBI File Transfer Protocol (FTP) server, including bacteria, viruses, fungi, archaea, invertebrates, environmental samples, protozoa, plants and whole genome shotgun (WGS) projects. The user selects the databases to be used for file preparation and for the specificity check. The BLAST results comprise all hits showing >90% sequence similarity to the query sequences and form the basis of the probe specificity check.
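Selecting hits above a similarity cutoff from BLAST output can be sketched as below. The sketch assumes tab-separated output with the default column order of BLAST+ `-outfmt 6` (query ID, subject ID, percent identity, ...), and it uses percent identity as a stand-in for the ">90% sequence similarity" criterion. How oli2go parses its BLAST results is not stated in this text, and the function name and example records are hypothetical.

```python
def hits_above_identity(lines, min_identity=90.0):
    """Yield (query, subject, percent identity) for hits above the cutoff."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        qseqid, sseqid, pident = fields[0], fields[1], float(fields[2])
        if pident > min_identity:
            yield qseqid, sseqid, pident


if __name__ == "__main__":
    example = [
        "probe1\tsubjA\t100.000\t25\t0\t0\t1\t25\t310\t334\t2e-05\t46.1\n",
        "probe1\tsubjB\t88.000\t25\t3\t0\t1\t25\t77\t101\t0.31\t32.2\n",
        "probe2\tsubjC\t95.652\t23\t1\t0\t2\t24\t12\t34\t0.002\t39.2\n",
    ]
    for hit in hits_above_identity(example):
        print(hit)
```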

Table 1. NCBI database sources used for the probe specificity check

Source | Number of sequences | Database fraction
Bacteria | 7,658,345 | 7.55%
Environmental samples | 7,276,975 | 7.18%
Invertebrates | 27,651,271 | 27.27%
Patented sequences | 31,140,928 | 30.71%
Plants | 3,798,824 | 3.75%
Viruses | 1,837,439 | 1.81%
Archaea | 38,310 | 0.04%
Fungi | 3,889,143 | 3.84%
Protozoa | 3,880,518 | 3.83%
WGS project sequences | 14,220,046 | 14.02%
Total number of sequences | 101,391,799 | 100.00%
Manba . Tartiblar soni. Ma'lumotlar bazasi fraktsiyasi.
Bakteriyalar 7 658 345 7.55%
Atrof -muhit namunalari 7 276 975 7.18%
Umurtqasizlar 27 651 271 27.27%
Patentlangan ketma -ketliklar 31 140 928 30.71%
O'simliklar 3 798 824 3.75%
Viruslar 1 837 439 1.81%
Arxeya 38 310 0.04%
Qo'ziqorinlar 3 889 143 3.84%
Protozoa 3 880 518 3.83%
WGS loyihasi ketma -ketligi 14 220 046 14.02%
Ketma-ketliklarning umumiy miqdori 101 391 799 100.00%

Barcha ma'lumotlar havzasidagi ketma -ketliklar soni va ularning ulushi keltirilgan.

Primer and probe selection

Primer and probe selection starts with the generation of k-mers, ranging from the user-defined minimum primer and probe size to the maximum size, using a step size of 1. The Tm is calculated for each k-mer (16, 19). Candidates whose Tm lies within the specified range are then checked for hairpin formation. The hairpin check is performed with ntthal, the nucleotide thermodynamic alignment tool of Primer3 (12). This program uses the thermodynamic parameter tables proposed by SantaLucia to calculate the secondary-structure Tm and the ΔG value of the most stable duplex (16). Oligonucleotides are accepted if the secondary-structure Tm and ΔG values are below the user-defined thresholds.
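The k-mer enumeration and Tm window can be sketched as follows. This is a minimal sketch under stated assumptions: `rough_tm` is a crude GC-content estimate used here as a stand-in, not the nearest-neighbor calculation the tool relies on, and the function names are hypothetical.

```python
def kmers(seq, min_len, max_len):
    """Yield (start, k-mer) for every length from min_len to max_len, step 1."""
    seq = seq.upper()
    for k in range(min_len, max_len + 1):
        for start in range(len(seq) - k + 1):
            yield start, seq[start:start + k]

def rough_tm(oligo):
    """Crude GC-based Tm estimate in degrees C (assumption: stand-in only,
    not the SantaLucia nearest-neighbor model)."""
    gc = oligo.count("G") + oligo.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(oligo)

def tm_candidates(seq, min_len, max_len, tm_min, tm_max):
    """Keep the k-mers whose estimated Tm falls inside [tm_min, tm_max]."""
    return [(start, kmer) for start, kmer in kmers(seq, min_len, max_len)
            if tm_min <= rough_tm(kmer) <= tm_max]

# Usage: candidates of 18 to 22 nt with an estimated Tm between 55 and 65.
template = "ATGGCTAGCTAGGATCCGATCGATCGGCTAGCTAGGCTAACG"
candidates = tm_candidates(template, 18, 22, 55.0, 65.0)
```

Candidates that survive this window would then go on to the hairpin check, which needs a thermodynamic tool such as ntthal and is not reproduced here.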

Probe specificity check

The probe specificity check is one of the main features of oli2go. In this step, every possible candidate is analyzed with BLAST against the user-selected databases (Table 1). The alignment results are compared with the target sequence hits generated during the file preparation step. Only probes that bind exclusively to the target sequence are accepted.
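One way to read the acceptance rule is as a set comparison: a probe is kept only if every database hit it produces is also a hit of its target sequence. The sketch below encodes that reading. The exact rule used by oli2go is not spelled out in the text, so the function `specific_probes`, its inputs and the accession strings are assumptions for illustration.

```python
def specific_probes(probe_hits, target_ids):
    """Return probes whose hits are all among the target's own hits.

    probe_hits: dict mapping a probe name to the subject IDs it aligned to.
    target_ids: subject IDs found for the target during file preparation.
    Probes without any hit are rejected here (assumption).
    """
    target_ids = set(target_ids)
    return [probe for probe, hits in probe_hits.items()
            if hits and set(hits) <= target_ids]

# Made-up identifiers, for illustration only.
target = {"acc_1", "acc_2", "acc_3"}
hits = {
    "probe1": ["acc_1", "acc_2"],   # only target hits: accepted
    "probe2": ["acc_1", "off_9"],   # one off-target hit: rejected
    "probe3": [],                   # no hit at all: rejected
}
accepted = specific_probes(hits, target)
```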

Primer definition and specificity check

The specific probes resulting from the preceding specificity check are used to find forward and reverse primer candidates that flank the oligonucleotide hybridization site. The detection capability of a probe depends on the specificity of the associated primers and on the preceding DNA amplification reaction. Oli2go outputs suitable primer pairs (each consisting of one forward and one reverse primer) that yield a product within the specified size range, form no secondary structures with each other and show a minimal difference in ΔG values. The primer specificity check is performed to minimize the risk of primers binding to human background DNA. Primer candidates are aligned to the human reference genome, downloaded from the NCBI FTP server, using the Burrows-Wheeler Aligner (BWA) (20).
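The pairing criteria (primers flank the probe site, product size within range, small ΔG difference) can be sketched as below. This is a hedged illustration: the `Primer` record, its coordinate convention (leftmost template position of each primer) and the ranking by ΔG difference are assumptions, and the secondary-structure check between the two primers is left out.

```python
from collections import namedtuple
from itertools import product

# Hypothetical record: start = leftmost template position, dg = ΔG in kcal/mol.
Primer = namedtuple("Primer", "start length dg")

def pair_primers(forward, reverse, probe_start, probe_end, min_size, max_size):
    """Return (ΔG difference, product size, forward, reverse) tuples,
    best first, for pairs that flank the probe and fit the size range."""
    pairs = []
    for f, r in product(forward, reverse):
        flanks = f.start + f.length <= probe_start and r.start >= probe_end
        size = r.start + r.length - f.start
        if flanks and min_size <= size <= max_size:
            pairs.append((abs(f.dg - r.dg), size, f, r))
    pairs.sort(key=lambda p: p[:2])  # smallest ΔG difference first
    return pairs

forward = [Primer(10, 20, -1.0), Primer(50, 20, -3.0), Primer(110, 20, -1.2)]
reverse = [Primer(200, 20, -1.2)]
ranked = pair_primers(forward, reverse, probe_start=100, probe_end=125,
                      min_size=100, max_size=250)
```

In this example the forward primer starting at 110 overlaps the probe site and is dropped, and the pair with the closest ΔG values is ranked first.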

Primer dimer check

The cross dimer, or primer dimer, check is an important design step for optimizing primer performance in multiplex reactions. Oli2go uses Primer3's ntthal together with user-defined ΔG and Tm values to check for cross dimerization. The specific forward and reverse primer pairs resulting from the previous design task are the input for this last step of the workflow. It starts with the input sequence that has the fewest specific primers. These primers are checked against all possible primers of the other input sequences. The first results contain the primer pairs that do not exceed the cross-dimerization thresholds. If the results contain at least one primer pair for each sequence, each one is checked against the other primers in the results. Finally, for each input sequence, one primer pair that forms no cross dimers with the primers of all other sequences is returned.
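The selection of one compatible primer pair per input sequence can be illustrated with a small search. This sketch deliberately swaps in two simplifications that are assumptions, not the tool's method: `forms_dimer` is a toy 3'-end complementarity test standing in for the ntthal ΔG and Tm thresholds, and `select_pairs` uses plain backtracking while keeping the "fewest primers first" ordering from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def forms_dimer(a, b, n=4):
    """Toy check: the last n bases (3' end) of one primer can pair with the other."""
    def hit(x, y):
        return x[-n:].translate(COMPLEMENT)[::-1] in y
    return hit(a, b) or hit(b, a)

def compatible(pair, chosen):
    """True if no primer of `pair` dimerizes with any primer already chosen."""
    return not any(forms_dimer(p, q)
                   for p in pair for other in chosen for q in other)

def select_pairs(candidates):
    """candidates: {sequence name: [(forward, reverse), ...]}.
    Return one pair per sequence without cross dimers, or None."""
    order = sorted(candidates, key=lambda name: len(candidates[name]))
    def search(i, chosen):
        if i == len(order):
            return dict(zip(order, chosen))
        for pair in candidates[order[i]]:
            if compatible(pair, chosen):
                result = search(i + 1, chosen + [pair])
                if result is not None:
                    return result
        return None
    return search(0, [])

candidates = {
    "seqA": [("ACACACACAAAA", "CACACACACACA")],
    "seqB": [("GTGTGTTTTTGT", "ACACAACCAACC"),   # clashes with seqA's forward primer
             ("CCAACCAACCAA", "ACCAACCAACCA")],
}
selection = select_pairs(candidates)
```

Starting with the sequence that has the fewest candidate pairs keeps the search small, because the most constrained choice is fixed first.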

Output

The output is presented on a separate web page and includes a table showing the resulting primers and probes, their Tm values, product sizes, hairpin Tm values and ΔG values. The table also contains web links to NCBI's online BLAST and Primer-BLAST for additional analysis, and it can be downloaded as a comma-separated values (CSV) file. Furthermore, the primer and probe sequences as well as the initial input sequences are available in FASTA format. The design parameters that were used can be downloaded as a text file.

Implementation

The software workflow runs on a Linux server (64 CPUs, 256 GB RAM). The main software packages used for the implementation are BLAST 2.7.0+, ntthal (which is part of Primer3 2.3.7), BWA, and Python 2.7 together with the Biopython library (21). To maximize the utilization of the server resources, most of the workflow steps run in parallel using multithreading. The responsive user interface is implemented with Bootstrap 3.3.7 and lets the user run oli2go on almost any device that can access the internet through a browser, from laptops and tablets to smartphones. Oli2go is freely accessible to all users at http://oli2go.ait.ac.at/.


A new feature was added to Primer-BLAST.

Tue, 29 Sep 2020 12:00:00 EST

We have added a new function to Primer-BLAST that helps users design primers common for a group of highly similar sequences.

Many users want to test if a gene is expressed but they don’t know or they don't care which transcripts are expressed. However, they do want primers to cover all transcript variants. Additionally, some users would like to have primers to cover a group of highly related bacteria strains.

Given a group of highly similar sequences, Primer-BLAST attempts to generate primers that are common for all sequences in this group. To find such primers, it uses BLAST to align the longest sequence among the group to the rest to find common regions which are then used to limit the locations of primers. The longest sequence is also used as the representative template sequence.

See the NCBI Insights post for an example search and more details.


Family-Specific Degenerate Primer Design: A Tool to Design Consensus Degenerated Oligonucleotides

Designing degenerate PCR primers for templates of unknown nucleotide sequence may be a very difficult task. In this paper, we present a new method to design degenerate primers, implemented in the family-specific degenerate primer design (FAS-DPD) computer software, for which the starting point is a multiple alignment of related amino acid or nucleotide sequences. To assess its efficiency, four different genome collections were used, covering a wide range of genomic lengths: Arenavirus ( nucleotides), Baculovirus ( bp), Lactobacillus sp. ( bp), and Pseudomonas sp. ( to bp). In each case, the primers designed by FAS-DPD were tested computationally to measure specificity. The primers designed for Arenavirus and Baculovirus were also tested experimentally. The method presented here is useful for designing degenerate primers on collections of related protein sequences, allowing detection of new family members.

1. Introduction

The polymerase chain reaction (PCR), one of the most important analytical tools of molecular biology, allows highly sensitive detection and specific genotyping of environmental samples, which is especially important in the metagenomic era [1]. A long list of genome typing applications includes arbitrarily primed PCR [2] (AP-PCR), random amplified primed DNAs [3] (RAPDs), PCR restriction fragment length polymorphism [4] (PCR-RFLP), and direct amplification of length polymorphism [5] (DALP). All of these techniques require a high quality and purity of the specific target template, because any available DNA could be a substrate for the amplification step. In view of this, genotyping procedures for large genomes or complex samples are more reliable if they are based on DNA amplification using specific oligonucleotides. Therefore, primer design is crucial for efficient and successful amplification.

Several primer design programs are available (e.g., OLIGO [6], OSP [7, 8], Primer Master [9], PRIDE [10], Primer3 [11], among others). Regardless of their computational working strategy, all of these use a set of common criteria (e.g., GC content, melting temperature, etc.) to evaluate the quality of primer candidates in a specific target region selected by the user. Alternative programs are aimed at more specific purposes, such as selection of primers that bind to conserved genomic regions based on multiple sequence alignments [12, 13], primer design for selective amplification of protein-coding regions [14], oligonucleotide design for site-directed mutagenesis [15], and primer design for hybridization [16]. Usually, the design of truly specific primers requires the complete nucleotide sequence, which is the starting point for most of the programs described in the literature. However, the need to design specific primers is not always accompanied by complete knowledge of the target genome sequence.

A primer, or more generally any DNA sequence, is called specific if it represents a unique sequence and is called degenerate if it represents a collection of unique sequences. For example, the amino acid sequence "YHP" could be coded by "TATCATCCC," "TACCATCCA," or "TACCACCCG," among others; all of these are unique sequences that can be summarized in a "degenerate" nucleotide sequence "TAYCAYCCN," using the IUPAC code. Operatively, the use of a degenerate primer implies the use of a population of specific primers that cover all the possible combinations of nucleotide sequences coding for a given protein sequence. Primers that include modified bases can also be used, since some modified bases can match different bases.
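The relation between a degenerate primer and the population of specific primers it stands for can be made concrete with a short sketch. The IUPAC table is standard; the function names `degeneracy` and `expand` are hypothetical. Because the histidine codons are CAT and CAC, the middle codon of the YHP example is written CAY here.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of specific sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every specific sequence covered by a degenerate primer."""
    return ["".join(bases)
            for bases in product(*(IUPAC[b] for b in primer.upper()))]

variants = expand("TAYCAYCCN")   # 2 * 2 * 4 = 16 specific primers
```

All three specific sequences quoted in the text appear among the 16 variants, which is what "covering" a set of sequences means in the design problem.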

Although an increase in degeneracy raises the chance of unspecific annealing of the designed primers, it also increases the probability of finding unknown divergent variants of a sequence family. This dual behavior must be taken into account during the design. The algorithmic search for primers that include degenerate positions is usually defined as the degenerate primer design (DPD) problem. In recent years, several methods have been developed to solve the DPD problem. Each one has a specific scope or is designed to solve a variant of the problem, but all of them aim to minimize the degeneracy of the resulting primers.

The DPD problem was expressed in different ways by many researchers. Linhart and Shamir [17] presented the maximum coverage DPD problem (MC-DPD), with the goal of finding a primer that covers the maximum number of input sequences. The selection of primers is constrained by limiting the maximum degeneracy. They also stated the minimum degeneracy DPD problem (MD-DPD), in which the objective is finding a primer with the minimum degeneracy that covers all the input sequences. To solve MC-DPD they have developed the HYDEN program [18]. Wei et al. [19] developed the DePiCt program that uses hierarchical clustering of protein blocks to design the primers. Rose et al. [20] developed a method for hybrid degenerate-nondegenerate primers, where the 3′ region is degenerated and its 5′ region is a consensus clamp. It was implemented in CODEHOP [21] and iCODEHOP [22] programs and was used to search new members of protein families and for identification and characterization of viral genomes. Balla and Rajasekaran [23] described a method for a variant of MD-DPD that tolerates mismatch errors, implemented in the minDPS program. The programs PT-MIPS and PAMPS address mainly the problem of multiple degenerate primer design. The aim of these programs is finding the minimum number of degenerate primers that cover all the input sequences, taking into account that none of them may be more degenerated than an input value.

In this study, a new method for solving the DPD problem is proposed, in which the focus is shifted away from the globally least degenerate primer in favor of maximizing a score that includes degeneracy weighted by its proximity to the 3′ end of the primer. This minimizes the degeneracy at that end while allowing more freedom in the remaining positions. Hence, the best-scoring primers may not be the least degenerate ones, but they take into account a biological restraint that is not so heavily considered in other methods. The 3′ end is the essential anchoring site because it is where the polymerase initiates its activity. From a strategic point of view, a decision must be made whether or not to allow degeneracy at this end. Degeneracy at the 3′ end probably ensures that a greater diversity of sequences can be detected. However, at the same time, it diminishes the proportion of the primer population that is specific for a given sequence. Therefore, we decided to be very strict in the search for conserved regions and to minimize the amount of degeneracy incorporated at this end. If the input set of sequences is sufficiently large, it is highly probable that a region identified as conserved among all known sequences will likewise be conserved in any new member of the family.
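The idea of weighting degeneracy by its distance from the 3′ end can be illustrated with a toy score. The actual FAS-DPD scoring formula is not given in this passage, so the linear position weight and the log2 penalty below are assumptions chosen only to show the principle.

```python
import math

# Number of bases each IUPAC code stands for.
SIZES = {"A": 1, "C": 1, "G": 1, "T": 1,
         "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
         "B": 3, "D": 3, "H": 3, "V": 3, "N": 4}

def score(primer):
    """Higher is better. A degenerate position costs log2(number of bases),
    scaled by a weight that grows linearly toward the 3' end (assumption)."""
    n = len(primer)
    penalty = sum(math.log2(SIZES[base]) * (i + 1) / n
                  for i, base in enumerate(primer.upper()))
    return -penalty

five_prime_n = score("NCGTACGA")    # degenerate base far from the 3' end
three_prime_n = score("ACGTACGN")   # same degeneracy, but at the 3' end
```

Both primers have the same overall degeneracy, yet the one with the degenerate base at the 3′ end scores lower, which is the behavior the text describes.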

2. Scoring and Primer Search Strategy

The method presented here can start from DNA or protein sequence alignments (Figure 1(a)). If the input is DNA, the sequences are aligned to obtain one global degenerate DNA consensus. If the input is a protein alignment, each protein of the alignment is back-translated into a degenerate DNA sequence, and all the degenerate DNA sequences are combined into one global degenerate DNA consensus. This consensus sequence covers all the putative input sequences that could be the origin of each protein sequence (Figure 1(b)). The consensus sequence may also code for amino acids that were not detected in the known sequences. This is inevitable given the kind of degeneracy of the genetic code.
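Back-translation and the column-wise consensus can be sketched as follows. This is a minimal sketch assuming the standard genetic code and a per-position union of codon bases; the function names are hypothetical and FAS-DPD's own implementation may differ.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = ["".join(c) for c in product(BASES, repeat=3)]

IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
         "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT"}
CODE_FOR = {frozenset(bases): code for code, bases in IUPAC.items()}
BASES_OF = {code: frozenset(bases) for code, bases in IUPAC.items()}

def back_translate(protein):
    """Degenerate DNA covering every codon of each amino acid (per-position union)."""
    out = []
    for aa in protein.upper():
        codons = [c for c, a in zip(CODONS, AMINO) if a == aa]
        for pos in range(3):
            out.append(CODE_FOR[frozenset(c[pos] for c in codons)])
    return "".join(out)

def consensus(degenerate_seqs):
    """Column-wise union of equal-length degenerate DNA sequences."""
    return "".join(CODE_FOR[frozenset().union(*(BASES_OF[b] for b in column))]
                   for column in zip(*degenerate_seqs))

yhp = back_translate("YHP")
merged = consensus([back_translate("YH"), back_translate("YQ")])
```

For serine, the per-position union gives WSN, which also matches codons of other amino acids. This is a small instance of the remark that the consensus may code for amino acids that are not present in the known sequences.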


Figure 1: panels (a) and (b).


Materials

The degenerate primer pair reported in the present study was designed by combining several software programs and web servers. The software programs used include the open-source Highly Degenerate primer (HYDEN) design program (http://acgt.cs.tau.ac.il/hyden/hyden_license.html) [7], FastPCR v6.7 (http://primerdigital.com/Fastpcr.html) [14], and Geneious Prime version 2020.1.2 (www.geneious.com/prime/). The primer pair was designed on an HP personal computer with a 64-bit operating system, an x64-based processor, 2 CPUs, and 500 GB of storage. The material used in this study consisted of 88 catA genes from authentic bacterial strains known to possess the catabolic gene. The gene sequences were downloaded in FASTA format from the NCBI database (https://ncbi.nlm.nih.gov). Files were converted from the .txt extension to FASTA format with an open-source web server (http://www.hiv.lanl.gov/content/sequence/FORMAT_CONVERSION/form.html).


Discussion

We developed and curated a reference database for 67 fish species belonging to 54 genera that are widespread across the Neotropical realm, and used it to develop a 12S mini-barcode marker and to estimate a genetic distance threshold for Neotropical fish species delimitation. Having a reference database associated with mini-barcode primer sets specific for Neotropical species is an important asset for DNA metabarcoding, especially when analyzing eDNA samples from such megadiverse fauna [21,22].

The 12S full-length and mini-barcode libraries provided enough molecular polymorphism to differentiate all 67 morpho-species. Moreover, the 12S full-length barcode (ca. 565 bp) was sufficient to discriminate all 70 MOTUs, in accordance with previous molecular (COI-based) identifications of the same specimens [28]. Interestingly, the taxonomic resolution of the mini-barcode region (NeoFish_3, 193 bp) was similar to that of the full-length database, yielding the same number of MOTUs (70) in the GMYC and genetic distance threshold analyses. The other analyses of the mini-barcode dataset either overestimated the number of MOTUs (bPTP, 76 MOTUs) or underestimated it (ABGD, 67 MOTUs).

When performing genetic distance threshold analysis using the full-length library, we obtained a threshold value (0.40%, Fig. 4a) similar to that of our mini-barcode region (0.55%, Fig. 4b). Fish species delimitation threshold values based on the 12S region are an important reference for future studies using this marker, which may need an a priori reference value when interpreting genetic distance data, such as the 2% widely used for COI [53]. Although we analyzed several genera from all major Neotropical fish taxa, the threshold will become more robust and better reflect the real divergence between species as more species are added to our reference database.
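How such a threshold is applied to a pair of aligned sequences can be sketched briefly. The passage does not state which distance model the study used, so the uncorrected p-distance below is an assumption, as are the function names. The 0.55% default is the mini-barcode threshold quoted above.

```python
def p_distance(a, b):
    """Uncorrected proportion of differing sites between two aligned sequences.
    Columns with a gap ('-') in either sequence are ignored."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in columns) / len(columns)

def same_species_cluster(a, b, threshold=0.0055):
    """True if the distance does not exceed the threshold (0.55% by default)."""
    return p_distance(a, b) <= threshold

# One mismatch in a 193 bp amplicon is about 0.52%, just under the threshold.
ref = "ACGT" * 48 + "A"
alt = "T" + ref[1:]
d = p_distance(ref, alt)
```

With a 193 bp amplicon a single substitution already comes close to the threshold, which illustrates why the text expects the value to be refined as more species are added.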

Species delimitation and taxonomic resolution analyses revealed the potential of NeoFish_3 amplicons to reliably identify species, since there was no relevant disparity between the full-length and mini-barcode libraries in these analyses. Similar results have been obtained for the COI gene in comparisons between full-length and mini barcodes, especially with degraded samples, demonstrating that mini barcodes are informative for species-level sorting of (1) major eukaryotic groups and archival specimens [45], (2) moth and wasp museum specimens [54], and (3) several bird species [55]. However, few congeneric species were analyzed in this study; to overcome this putative drawback, future analyses should include more species from the same genus to provide even more robust results.

SWAN analysis showed that the target NeoFish_3 amplicon would be the best region for taxonomic differentiation of species, since it recovered the best indices in all established criteria (Fig. 2). However, we did not analyze the whole 12S gene of all species to properly compare NeoFish_3 with other previously used amplicons (MiFishU and Teleo1) in terms of characteristics such as taxonomic resolution and best primer site. The target 12S rRNA gene region used to build our reference database represents approximately 60% of the full-length 12S gene (952 bp) (Fig. 1a) and includes only a small fragment of the 12S region amplified by the MiFishU marker as well as the initial region of the forward Teleo1 primer (Fig. 1b).

In vitro tests showed that the newly developed NeoFish_3 marker is efficient: it amplified the target region of the 12S rRNA gene from 22 tissue DNA extracts and from environmental DNA recovered from an aquarium containing one fish species (Supplementary Table S1; Fig. S1). However, further evaluation of amplification success is recommended with samples obtained from Neotropical river basins, using a DNA metabarcoding approach for a whole fish community, as different types of environmental samples vary in patterns of DNA degradation and exposure to inhibitors [33]. Although 67 fish species represent a low percentage of the Neotropical freshwater fish species, they nevertheless cover the main Neotropical orders, since we included DNA of species from Characiformes, Cyprinodontiformes, Gymnotiformes, Perciformes, Siluriformes, and Synbranchiformes.

Amplification of non-target organisms has previously been reported as a drawback of the available universal eDNA primer sets, which has led to the use of human blocking primers to avoid cross-amplification. In our in silico PCR analysis, NeoFish_3 showed better specificity against non-target taxa than the previously designed primer sets (Teleo1 and MiFishU). For Teleo1 and MiFishU, more than 1000 Mammalia sequences, including Homo sapiens, were amplified (Table 2), while NeoFish_3 showed no cross-amplification of these. Moreover, the studies that used the Teleo1 and MiFishU markers to assess fish community diversity in French Guiana 21 and Japan 31 both reported amplification of DNA from insects and mammals when analyzing eDNA samples. Such untargeted amplification and detection in eDNA studies may hamper the identification of rare species, since it may consume most of the DNA sequences obtained 29,56. However, before assuming that NeoFish_3 outperforms other 12S mini-barcode markers, in situ tests would be needed to check whether it indeed amplifies fewer non-target species.
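The in silico PCR comparison described above can be illustrated with a minimal sketch. This is not the tool or the parameter set used in the study: the function names, the toy template, and the toy primers are all hypothetical, and the matching is deliberately simplified (Hamming mismatches only, no IUPAC degenerate bases, no 3'-end weighting, a single strand orientation).

```python
# Minimal in silico PCR sketch. All sequences and names are illustrative
# assumptions, not the NeoFish_3, Teleo1 or MiFishU primers.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def binding_sites(template, primer, max_mismatches=0):
    """Return start positions where primer aligns with <= max_mismatches."""
    sites = []
    for start in range(len(template) - len(primer) + 1):
        window = template[start:start + len(primer)]
        mismatches = sum(a != b for a, b in zip(window, primer))
        if mismatches <= max_mismatches:
            sites.append(start)
    return sites


def in_silico_pcr(template, forward, reverse, max_mismatches=0):
    """Return the first predicted amplicon (primers included), or None."""
    # The reverse primer anneals to the opposite strand, so on the given
    # strand we look for its reverse complement downstream of the forward site.
    reverse_site = reverse_complement(reverse)
    for f in binding_sites(template, forward, max_mismatches):
        for r in binding_sites(template, reverse_site, max_mismatches):
            if r >= f + len(forward):
                return template[f:r + len(reverse_site)]
    return None


target = "TTTACGTACGGGGCCCCTTGACAAAA"   # toy "fish" template
non_target = "GGGGGGGGGGGGGGGGGGGG"       # toy "non-target" template
print(in_silico_pcr(target, "ACGTAC", "TGTCAA"))
print(in_silico_pcr(non_target, "ACGTAC", "TGTCAA"))
```

Counting how many sequences of a given taxon return an amplicon, for each primer pair and mismatch allowance, gives a per-taxon amplification count of the kind summarized in Table 2.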

Herein, we applied a powerful framework for the development and validation of a fish-specific primer set together with a custom reference database aimed at DNA metabarcoding analysis in the Neotropical realm. Species delimitation analyses strongly suggest that, even when using a short region of the mitochondrial 12S gene, we could discriminate each taxon to the species level. In addition, we were able to set an interspecific distance-based threshold for species delimitation that should be helpful in the bioinformatic analysis of metabarcoding short reads. Thus, our custom reference database and mini-barcode markers are an important asset for ecoregion-scale, DNA-based biodiversity evaluation, such as eDNA metabarcoding, that can help with the complex task of conserving the megadiverse Neotropical ichthyofauna.
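As a rough illustration of how a distance-based threshold might be applied to aligned short reads, the sketch below computes an uncorrected pairwise distance (p-distance) and compares it with a cutoff. The distance model, the function names, the toy sequences, and the 0.05 cutoff are all assumptions made for this example; the threshold value and distance model used in the study are not restated here.

```python
# Hypothetical sketch: uncorrected p-distance between two aligned sequences
# and a threshold check. The cutoff below is a placeholder, not the study's.
def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def within_threshold(seq_a, seq_b, threshold):
    """True if the two sequences are closer than the given distance cutoff."""
    return p_distance(seq_a, seq_b) < threshold


EXAMPLE_THRESHOLD = 0.05  # placeholder value for illustration only
read = "ACGTACGTAC"
reference = "ACGTACGTAT"
print(p_distance(read, reference))
print(within_threshold(read, reference, EXAMPLE_THRESHOLD))
```

In a metabarcoding pipeline, a read whose distance to a reference sequence falls below the chosen cutoff would be treated as a candidate match to that species, while larger distances would point to a different or unrepresented taxon.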

