TSSFinder

Please enter the files to upload and choose the model.
*Required fields.
**Maximum File Size = 150 MB
Fields Data
Organism Model*
Start codon position (.bed file)*
Fasta sequence*
 

This is the web server version of TSSFinder. TSSFinder searches for the transcription start site (TSS) closest to the start codon of a gene. Searches are limited to the 2000 nt region upstream of the start codon position.

 


TO SUBMIT A JOB

 

To submit a job, upload two files:

1.     A multi-fasta file with the genomic region (maximum size: 150 MB).

2.     A BED file containing the locations of the start codons (maximum of 150 entries). This file anchors the TSS search: TSSFinder searches the region upstream of each start codon position, scanning up to 2000 nucleotides.
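Before uploading, the two limits above can be checked locally. The following is a minimal sketch, not part of TSSFinder: the function name `check_limits` is hypothetical, "150 MB" is assumed to mean 150 × 1024 × 1024 bytes, and every non-blank BED line is assumed to count as one entry.

```python
# Limits taken from the text above; the byte interpretation of "150 MB" is an assumption.
MAX_FASTA_BYTES = 150 * 1024 * 1024
MAX_BED_ENTRIES = 150


def check_limits(fasta_size_bytes, bed_lines):
    """Return a list of problems; an empty list means both limits are met."""
    problems = []
    if fasta_size_bytes > MAX_FASTA_BYTES:
        problems.append(
            f"fasta file is {fasta_size_bytes} bytes; limit is {MAX_FASTA_BYTES}"
        )
    # Count only non-blank lines as entries (assumption).
    entries = sum(1 for line in bed_lines if line.strip())
    if entries > MAX_BED_ENTRIES:
        problems.append(
            f"BED file has {entries} entries; limit is {MAX_BED_ENTRIES}"
        )
    return problems
```

With real files, the size can be obtained with `os.path.getsize(path)` and the BED lines by iterating over the opened file.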

 

NOTE: If you need to process a large amount of data, please install TSSFinder on your local computer. You can find the source code for Ubuntu Linux and a Docker container at https://tssfinder.github.io

 

THE RESULTS

 

TSSFinder will produce two BED files, which can be downloaded: one with the position of the predicted TSS for each gene, the other with the predicted positions of any TATA boxes found.

 

FILE FORMATS

 

FASTA

 

Fasta files should contain the genomic sequence on which the prediction will be performed. The genomic sequence could be, for example, a whole chromosome or the 3 kbp upstream of each target gene. This website accepts multi-fasta files. The “name” of each sequence is taken from its fasta header: it consists of all the characters up to the first blank or tab.

>GenomeSequence1_1. XPTO1

CTGATTTTCGTTGGCCCTAGATTTCATCAATCTCTAATTTCATTTTGTATTTTTATCGTTTTGAAATTTAAATGTCAAGTC

CCAACGGTCCTCTGATCTCGGCAGTTTTTGTGTTATGTAAATGA

CTAACATACCCTTACTTAACACGTGTCTTTCTCTTTCCTTTTAATGGGCCGGATTCTAGTTTGGGTCCAGTTATAATTTC

CTTTGTGTTTCGAATTAGGTTTAAATTTTACTTTATTAGAATTAA

GCCCAATAACGAGTTTTGTTGCCAAATTTTCTTACTTGCTATATATGTCGTCAACATGCATACTATATAATCTCATAACAA

AGTTTTTTTTTTTTTTTTTCTAAATTGTAAATATCATATAAATG

AAAAPAACAGGTTTTACAAATTTTGATAACCTAAACCATTTGAACTCTTTGGCAAAAAGAAATAAAAAACAAAGAGTCAAT

CGAAAACTGGGAAAAAACTTGGAAGTTTCATCACAGTCACAACGC

In this example the “name” of the sequence is “GenomeSequence1_1.”; the “XPTO1” will be ignored by TSSFinder.
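The naming rule can be illustrated with a short sketch. This is not TSSFinder's own parser; the function name `fasta_names` is hypothetical, and the code only applies the rule stated above (header text after ">" up to the first blank or tab).

```python
def fasta_names(lines):
    """Yield the name of each fasta record: header text up to the first blank or tab."""
    for line in lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\r\n")
            # Treat tabs like blanks, then keep only the text before the first one.
            yield header.replace("\t", " ").split(" ", 1)[0]


example = [
    ">GenomeSequence1_1. XPTO1\n",
    "CTGATTTTCGTTGGCCCTAG\n",
    ">chr2\tsome description\n",
    "ACGTACGT\n",
]
print(list(fasta_names(example)))
```

The names obtained this way are the ones that the first field of the BED file has to match.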

 

BED


The BED file should be tab-separated, with at least 6 fields on each line:

 

1.       The sequence where the Start codon is located (this will be the fasta “header”)

2.       The start position of the Start codon in the sequence

3.       The end position of the Start codon in the sequence

4.       The gene name

5.       A score (not used, this field will be ignored by TSSFinder)

6.       The strand (“+” or “-”). On the positive strand the TSS is searched upstream of the start codon; on the negative strand it is searched downstream in sequence coordinates.
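The strand rule in field 6, combined with the 2000 nt search limit, can be sketched as follows. This is an illustration under stated assumptions, not TSSFinder's implementation: the function name `search_window` is hypothetical, coordinates are assumed to be BED-style (0-based, half-open), and the window is assumed to start immediately next to the start codon and to be clipped at the sequence ends. TSSFinder's own boundary handling may differ.

```python
SEARCH_SPAN = 2000  # nt, the search limit stated on this page


def search_window(start, end, strand, seq_len, span=SEARCH_SPAN):
    """Return the (window_start, window_end) interval searched for a TSS.

    Assumes 0-based, half-open coordinates for both input and output.
    """
    if strand == "+":
        # Positive strand: look at lower coordinates, ending where the codon starts.
        return max(0, start - span), start
    if strand == "-":
        # Negative strand: look at higher coordinates, beginning where the codon ends.
        return end, min(seq_len, end + span)
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")


print(search_window(5000, 5003, "+", 10000))
print(search_window(5000, 5003, "-", 10000))
```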

 

NOTE: If there are more than 6 fields, the extra fields will be ignored.

GenomeSequence1  1597842     1597843     AT1G05440_1    1     -

GenomeSequence1  1766149     1766150     AT1G05840_1    1     -

GenomeSequence2  1775654     1775655     AT1G05880_1    1    +
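A line in this format can be read with a few lines of code. The sketch below is only an illustration of the field layout described above; the names `StartCodon` and `parse_bed_line` are hypothetical, and the strictness of the checks is an assumption rather than TSSFinder's documented behaviour.

```python
from typing import NamedTuple


class StartCodon(NamedTuple):
    seq_name: str  # field 1: name of the fasta sequence
    start: int     # field 2: start position of the start codon
    end: int       # field 3: end position of the start codon
    gene: str      # field 4: gene name
    strand: str    # field 6: "+" or "-"


def parse_bed_line(line):
    """Parse one tab-separated BED line, ignoring the score and any extra fields."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 tab-separated fields, got {len(fields)}")
    seq_name, start, end, gene, _score, strand = fields[:6]
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    return StartCodon(seq_name, int(start), int(end), gene, strand)


record = parse_bed_line("GenomeSequence1\t1597842\t1597843\tAT1G05440_1\t1\t-\textra\n")
print(record)
```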

 

Please note:

   You can describe more than one start codon per sequence.

   The first field should correspond to the name of one of the sequences in the multi-fasta file.

   The locations of the start codons are given relative to the sequences in the fasta file.
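The second point can be checked before submission. The following is a minimal sketch, assuming the naming rule from the FASTA section (text after ">" up to the first blank or tab); the function name `missing_sequences` is hypothetical and is not part of TSSFinder.

```python
def missing_sequences(fasta_lines, bed_lines):
    """Return the sorted BED sequence names that have no matching fasta record."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            header = line[1:].rstrip("\r\n")
            # Name = header text up to the first blank or tab.
            names.add(header.replace("\t", " ").split(" ", 1)[0])

    missing = set()
    for line in bed_lines:
        if line.strip():
            seq_name = line.rstrip("\r\n").split("\t", 1)[0]
            if seq_name not in names:
                missing.add(seq_name)
    return sorted(missing)


fasta = [">GenomeSequence1 first record\n", "ACGT\n", ">GenomeSequence2\n", "ACGT\n"]
bed = [
    "GenomeSequence1\t10\t13\tgeneA\t1\t-\n",
    "GenomeSequence3\t5\t8\tgeneB\t1\t+\n",
]
print(missing_sequences(fasta, bed))
```

An empty result means every BED entry points at a sequence present in the multi-fasta file.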