Bioinformatics Group




The Bioinformatics Group is headed by Glaucia M. Souza and PhD student Milton Y. Nishiyama Jr. It began in 2001 in the Sugarcane Signal Transduction Laboratory, within the Department of Biochemistry at the University of Sao Paulo. The group is now fully integrated into the BIOEN Project, which is funded by FAPESP and brings together research groups from several universities. Its main aim is to develop and apply biological concepts and state-of-the-art mathematical and computer science techniques to problems arising in the life sciences in the genomic and post-genomic era. This interdisciplinary research is closely linked with the Institute of Mathematics and Statistics (IME), the Department of Biochemistry and the Institute of Biomedical Sciences, though we also maintain and encourage links to other research groups.

The group is developing a computational infrastructure and computing facilities to support high-throughput analysis for grasses, particularly sugarcane (Saccharum sp.). It also maintains dedicated computing facilities of its own to host specialized biological databases and to provide public access to the software and methods developed within the group.

Group Members

Milton Yutaka Nishiyama Jr. - Bioinformatics Specialist and PhD Student

M.Sc. Edwin Delgado Huaynalaya - Java Developer

Pablo de Morais Andrade - TT at BIOEN-FAPESP and PhD Student

Group Research

Signal Transduction


Carbohydrate Metabolism

Genome Analysis

  • Sugarcane BACs High-Throughput Sequencing (454)
  • Sugarcane Genome High-Throughput Sequencing (454)
  • Genome annotation using software agents

Microarray Analysis

  • Data integration for cDNA and oligo microarray analysis
  • Database development
  • Data visualization


Data Mining and Curation

  • Information extraction for biological research

Mathematical Modelling in Biology

  • Mathematical models of cancer immunotherapy
  • Evolutionary dynamics and evolutionary game theory
  • Pattern formation in embryology

Systems Biology

  • Systems biology applied to Sugarcane
  • Mathematical models of cellular signalling (both intra- and intercellular)
  • Gene networks
  • In silico organs




The largest collection of Sugarcane ESTs was generated by SUCEST, a large consortium of Brazilian researchers who sequenced approximately 238,000 ESTs from 26 diverse cDNA libraries (Vettore et al. 2003).

A Functional Genomics phase followed the initial sequencing effort, and a database was created to integrate sequences, gene expression data, gene categories and data mining tools, with the aim of providing comprehensive access to sugarcane genomics resources.

The SUCEST-FUN Database is built on the mediator approach, which incorporates concepts from the Data Warehouse and Federation approaches. It is a flexible data platform that assembles and integrates heterogeneous, distributed data sources, experimental data and resources, together with scientific algorithms and computational analyses.
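To make the mediator idea concrete, here is a minimal sketch in Python. It is not the SUCEST-FUN implementation: the `Mediator` class, the two wrapper functions, the gene identifier and the sample data are all hypothetical, chosen only to show how one query can be answered from sources that keep their own native formats.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical wrapper type: takes a gene identifier and returns
# that source's fields as a plain dictionary.
Wrapper = Callable[[str], Dict[str, object]]


@dataclass
class Mediator:
    """Answers one query by fanning it out to every registered source wrapper."""

    wrappers: Dict[str, Wrapper] = field(default_factory=dict)

    def register(self, name: str, wrapper: Wrapper) -> None:
        self.wrappers[name] = wrapper

    def query(self, gene_id: str) -> Dict[str, Dict[str, object]]:
        # Each source keeps its own format; its wrapper translates on demand.
        return {name: wrap(gene_id) for name, wrap in self.wrappers.items()}


# Two made-up sources with different native formats (illustrative data only).
SEQUENCES = {"GENE_A": "ATGGCC"}                      # dictionary keyed by id
EXPRESSION_ROWS = ["GENE_A\tleaf\t2.5", "GENE_A\troot\t0.8"]  # tab-separated text


def sequence_wrapper(gene_id: str) -> Dict[str, object]:
    seq = SEQUENCES.get(gene_id)
    return {"sequence": seq, "length": len(seq) if seq else 0}


def expression_wrapper(gene_id: str) -> Dict[str, object]:
    levels = {}
    for row in EXPRESSION_ROWS:
        row_id, tissue, value = row.split("\t")
        if row_id == gene_id:
            levels[tissue] = float(value)
    return {"levels": levels}


mediator = Mediator()
mediator.register("sequences", sequence_wrapper)
mediator.register("expression", expression_wrapper)
result = mediator.query("GENE_A")
```

In this sketch the caller sees a single `query` interface, while adding a new data source only requires registering another wrapper. A real mediator would also have to reconcile identifiers and semantics across sources, which this example leaves out.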

Bioinformatics and the management of scientific data are critical to supporting discoveries in the life sciences. The volume of available biological data and research has grown explosively, and most of it is compiled and stored in dozens of smaller databases. Scientists cannot currently identify and integrate autonomous data sources easily, or exploit this information, because of the variety of semantics, interfaces and data formats used by the underlying data sources.

The SUCEST-FUN Database is therefore being developed to give access to genomic and EST gene sequences and gene expression studies, and to make available tools that will allow a Systems Biology approach in sugarcane and the identification of regulatory networks.

Projects and Coordinators

  • Sugar Content - Brix: Glaucia Souza and Monalisa Sampaio
  • Sugar Content - Nitrogen and Sugar: Michel Vincentz
  • Drought in Field: Marcelo Menossi and Glaucia Souza
  • Drought in Vitro: Antonio Figueira
  • Fiber: Marcelo Barbosa, Marcelo Loureiro and Glaucia Souza
  • Insect: Marcio Silva Filho
  • Pathogen: Luis Eduardo Aranha



The Signal Transduction Laboratory of the Institute of Chemistry of USP contributes to Sugarcane Systems Biology studies by analyzing and providing results from genome sequencing, transcriptome analysis and metabolome analysis of this grass.

In addition to providing access to data, we develop data mining tools. The SUCEST-FUN Database has a front end developed in Java, and several scripts written in Perl and R analyze the data and insert the results into the database using SQL.

Additionally, several scripts format and process data from bench experiments.
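The step in which a script inserts analysis results into the database through SQL can be sketched as follows. The group's scripts are written in Perl and R, so this Python version is only a stand-in, and the in-memory SQLite database, the `expression_result` table, its columns and the sample values are all assumptions made for illustration, not the SUCEST-FUN schema.

```python
import sqlite3


def store_results(conn, results):
    """Insert (gene_id, treatment, log2_ratio) rows with a parameterized statement."""
    # Hypothetical table layout; the real schema is not described in the text.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS expression_result ("
        " gene_id TEXT NOT NULL,"
        " treatment TEXT NOT NULL,"
        " log2_ratio REAL NOT NULL,"
        " PRIMARY KEY (gene_id, treatment))"
    )
    with conn:  # commits on success, rolls back if an insert fails
        conn.executemany(
            "INSERT OR REPLACE INTO expression_result VALUES (?, ?, ?)",
            results,
        )


# In-memory database and made-up values, used only to exercise the function.
conn = sqlite3.connect(":memory:")
store_results(conn, [("GENE_A", "drought", 1.7), ("GENE_B", "drought", -0.4)])
rows = conn.execute(
    "SELECT gene_id, log2_ratio FROM expression_result ORDER BY gene_id"
).fetchall()
```

Placeholders (`?`) keep the data separate from the SQL text, and the composite primary key with `INSERT OR REPLACE` lets a script be re-run without duplicating rows. Both are design choices of this sketch rather than documented behavior of the group's scripts.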

The SUCEST-FUN Database includes the reference genome of sugarcane, data from other genome analyses such as RNA-Seq and ESTs, annotation, gene catalogues, metabolomics, differential expression data using microarrays, physiology and technology data.