OriGen: the search for the Mexican genome

The oriGen project will sequence the genome of 100,000 Mexicans to fill in the gaps in our knowledge about our identity, promote precision medicine, strengthen prevention, and improve our quality of life.

Deep within our cells lies the necessary information for improving our health, from which we could benefit through treatments and medicines tailored to our Mexican identity.

Experts liken the human genome to a book that contains all the instructions for us to develop, grow, and function, but we haven’t finished reading the specific chapters on what makes us who we are.

As humans, around 99.9% of our genome is identical between individuals. This is the part that gives us the common traits that we share, even with other animal species, such as walking upright on two feet, or the fact that we have two eyes, a nose, and a mouth.

The remaining 0.1% is what makes us different. It’s what gives us our hair and skin color, as well as what makes us Mexican. This chapter still holds many mysteries.
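
To get a feel for how much DNA that small share represents, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the genome size of roughly 3 billion base pairs is an assumption that does not come from this article, and the function name is made up for the example.

```python
# Approximate size of one copy of the human genome, in base pairs.
# Assumption: this round figure is not stated in the article.
GENOME_SIZE_BP = 3_000_000_000

# "99.9% identical" means 999 out of every 1,000 positions are shared.
SHARED_PER_THOUSAND = 999


def differing_positions(genome_size_bp: int, shared_per_thousand: int) -> int:
    """Return the rough number of positions that differ between two people."""
    differing_per_thousand = 1000 - shared_per_thousand
    # Integer arithmetic avoids floating-point rounding on the percentage.
    return genome_size_bp * differing_per_thousand // 1000


if __name__ == "__main__":
    n = differing_positions(GENOME_SIZE_BP, SHARED_PER_THOUSAND)
    print(f"{n:,} positions differ")
```

Under those assumptions, a share of one in a thousand still amounts to about three million positions, which is why it takes large-scale sequencing to study it.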

At the beginning of 2023, the pilot test began, and a team of specialists went out into the streets of the Monterrey metropolitan area, knocking on the doors of hundreds of volunteers. They asked them to provide a blood sample to obtain their DNA and to complete a questionnaire about their health and lifestyle.

Phase 1 began in June 2023 in Monterrey, Chihuahua, Coahuila, Durango, and Tamaulipas, and in the following months, the oriGen field team visited a total of 17 cities across the country. By 2025, the goal had been achieved: 100,000 samples.

These are the first steps to reaching the oriGen project’s goal of sequencing the genome of 100,000 Mexicans, with the aim of deciphering who we are, what makes us unique, and how we can prevent and address the diseases that most affect us.

Inside the Genome

Nearly every human cell contains 23 pairs of chromosomes (46 in total), into which the DNA is organized.

Each chromosome, in turn, is formed of a long DNA molecule wound around spool-like proteins called histones.

Chromosomes are shaped like an X whose center is called a centromere, and the end of each arm is a telomere.

The human genome consists of 23 pairs of chromosomes. These are organized into a structure similar to a tightly wound spool of thread.

The histones allow the DNA to be wound tightly around them, compacting it into structures small enough to fit inside each cell.

Within all this DNA are the genes, which in humans range in number from 20,000 to 25,000.

Each of these genes codes for around three proteins, which are the building blocks that make our bodies work.

Besides the DNA that codes for proteins, there are noncoding genes, regulatory regions, and other parts whose function is still unknown.

Nucleotides are the building blocks of DNA. Each one consists of one of four nitrogenous bases (adenine, guanine, cytosine, or thymine) plus a phosphate group and a sugar molecule.
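
Because each nucleotide is identified by its base, a stretch of DNA is commonly written as a string of the letters A, G, C, and T. The short Python sketch below shows that idea by counting the bases in a fragment. The function name and the example fragment are made up for illustration and have nothing to do with the oriGen data.

```python
from collections import Counter

BASES = "ACGT"  # adenine, cytosine, guanine, thymine


def count_bases(sequence: str) -> dict[str, int]:
    """Count how often each of the four bases appears in a DNA string."""
    sequence = sequence.upper()
    unknown = set(sequence) - set(BASES)
    if unknown:
        raise ValueError(f"not a DNA base: {sorted(unknown)}")
    counts = Counter(sequence)
    # Report all four bases, including any that never appear.
    return {base: counts[base] for base in BASES}


if __name__ == "__main__":
    # Made-up fragment, for illustration only.
    print(count_bases("GATTACA"))
```

A real genome is the same kind of string, only billions of letters long.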

The forgotten genomes

In 2019, Guillermo Torre, Rector of TecSalud and Vice President of Research at Tec de Monterrey, and Víctor Manuel Treviño, a bioinformatics researcher, discussed the major health problems affecting Mexicans and the lack of genomic information.

“Throughout much of my career, we studied international genetic data due to the lack of data on Mexicans,” says Treviño, the project’s Chief Scientific Officer.

Until now, genetic studies of the Mexican population have been scarce and restricted to small samples in a few states, which did not represent the entire population.

One of the main reasons for attempting to map the human genome is to achieve precision medicine and prevent diseases from developing. For example, by understanding the genes and their variations in different populations, you can predict whether a medicine will work or whether someone has a specific predisposition.

Mexicans suffer from conditions such as diabetes, obesity, and high blood pressure, but it’s not yet clear as to why these predominate over others. The answer could be in our genes.

However, less than 1% of all the large-scale genomic analyses carried out worldwide have included Latin American populations.

Massive sequencing projects have been launched in countries like the UK, France, Australia, and the US since 2011. In the United States of America, the National Human Genome Research Institute has performed massive sequencing on the genomes of different US populations.

As a result, there has been better detection of rare diseases in newborns, children, and adults. This has also made it possible for doctors to give personalized treatment to individuals, significantly improving their quality of life.

Despite this information having also helped other non-Caucasian populations, we need to broaden our understanding of our own genetic identity.

“In Mexico, we have to import treatments and health schemes,” explains Treviño. “The problem is that Mexicans sometimes don’t respond to treatments in the same way when we try to implement them because they were designed for other populations.” This is why the oriGen project started to take shape.

Populations in which the genome has been mapped

One of the main reasons for attempting to map the human genome is to achieve precision medicine.

If we know the genes of different populations and their variations, we can predict whether a certain medication is going to work or not, whether there is any predisposition to any disease, and many other outcomes.

What’s more, genomics can detect a disease well before the occurrence of symptoms, which is very good for human health because it allows early diagnosis and thus significantly increases the likelihood of successful treatment.

Fortunately, governments have increasingly begun to invest in projects that apply genomics to the field of medicine and treating people.

Established in 2014, Australian Genomics is a set of 78 national organizations, including diagnostic laboratories and academic institutions specializing in scientific research. There are currently more than 40 projects focused on sequencing the genomes of people with rare diseases and different types of cancer. The data from several of these projects are now available and have led to an increase in early diagnosis of cancer and rare diseases.

The French Plan for Genomic Medicine 2025 was launched in 2015, with the aim of incorporating genomic medicine into public health and establishing a genomic medicine industry that promotes innovation and paves the way for personalized medicine.

Genomics England (GEL) was founded in 2012 with an initial project to map the genomes of 100,000 patients with around 100 rare diseases and seven common types of cancer, as well as their relatives. The objective of sequencing these genomes was achieved in 2018.

In 2005, the Genomic Diversity Project of the Mexican population began. It was launched by a group of Mexican scientists from the National Institute of Genomic Medicine (INMEGEN). They sequenced 300 Mexican genomes from six states. Later, in 2012, researchers from the Mexican Biobank of Metabolic Diseases published the results of a study in which they sequenced the genomes of 4,361 Mexicans.

In the United States, the National Human Genome Research Institute launched genomic medicine programs in 2011 to take strides toward precision medicine. Some of these projects have led to increased detection of rare diseases in newborn babies, children, and adults.

A major project

In August 2025, the oriGen project reached the goal of 100,000 samples. The volunteers came from 17 cities located in the north, center, and south of the country, to capture regional diversity. An equitable representation of both men and women was also sought.

Thus, it will be possible to deeply understand the population’s genetic composition, to begin to reveal what’s in our genes that makes us different from Caucasian, Asian, or African populations.

According to Pablo Kuri Morales, the project’s director since March 2022, “It’s an extraordinary and unique project for Mexico and Latin America.”

It is unique due to the number of people it will sequence, the fact that they will be randomly selected from an age range of 18 to 70, and that it will represent the national population.

“We want to obtain information from Mexicans, but if I only analyze people from the north, my study won’t be representative of the entire country,” argues Néstor Rubio, a researcher at the School of Medicine and Health Sciences and leader of the oriGen process.

Since its inception, the project has been ambitious and planned step by step in order to be successful. Thanks to the inauguration of the Core Lab Genomics genome sequencing laboratory in 2021, Tecnológico de Monterrey has state-of-the-art equipment and highly specialized personnel to perform massive genome sequencing.

OriGen Project timeline and background on genomic studies in Mexico

1990

The Center for Genomic Sciences (CCG) at the National Autonomous University of Mexico (UNAM) performs a large-scale genome sequencing project on the Rhizobium etli bacterium (which is beneficial to beans), becoming the first project of this kind in Mexico.

1997

Mexico participates in a project that is able to sequence the genome of the Escherichia coli (E. coli) bacterium.

1998 to 2004

Implementation of the Mexico City Prospective project, a series of genetic and genomic studies to identify variants associated with chronic non-communicable diseases in the population of Mexico City, which researcher Pablo Kuri takes part in.

2014

A study is conducted on the genetic diversity of Mexico’s indigenous populations, with more than 1,000 samples nationwide.

2019

Talks begin between Guillermo Torre, Rector of TecSalud and Vice President of Research at Tec de Monterrey, and Víctor Manuel Treviño Alvarado, Scientific Director of the oriGen Project, on the idea of conducting a project on the genetic diversity of the Mexican population.

Early 2021

Guillermo Torre, Víctor Manuel Treviño, and Tec research professor Elena Cristina González Castillo create the oriGen project, with the aim of genome sequencing 100,000 Mexicans.

February 2021

Tec de Monterrey and the FEMSA Foundation inaugurate their Core Lab Genomics genome sequencing laboratory to perform massive genome sequencing with state-of-the-art equipment.

October 2021

The oriGen project, which will be directed by researcher Pablo Kuri, is revealed to the public at TecSalud’s first Genomics in Health Conference, where it is also announced that they will be working with the National Institute of Genomic Medicine (INMEGEN).

February 2023

A pilot test is launched in Monterrey, where the oriGen field team goes door-to-door, inviting people to participate as volunteers by donating their blood and completing a health survey. In the following months, the team would visit a total of 17 cities across the north, center, and south of the country.

September 2023

Tec signs a collaboration agreement with the Regeneron Genetics Center to boost the scope and goals of the oriGen project.

August 2025

The goal of 100,000 samples has been reached.

2026

By the second half of the year, 100,000 exomes and more than 11,000 genomes will have been sequenced. This genetic database will be made available to the national and international scientific community to begin research.

How do you read DNA?

1. The first step is extracting a blood sample, which is then placed in a centrifuge tube.

2. At this point, an enzyme solution is added to separate DNA from the other components.

3. When the centrifuge is activated, the DNA stays at the bottom of the tube, ready to be sequenced.

4. DNA sequencing serves to determine the order of the nitrogenous bases of the genome, as well as discover which spaces are occupied by certain sequences and genes.

How do you sequence DNA?

1. DNA sequencing starts with the DNA extracted from a blood sample, which is placed in a sequencing machine.

2. Although there are different ways of sequencing and machines for achieving this, one of the best known is Sanger sequencing, named in honor of the scientist who created the method.

3. In this type of sequencing, an enzyme called DNA polymerase starts duplicating several short pieces of DNA.

4. Special nucleotides with fluorescent markers are added as it does so. When the DNA polymerase incorporates one of them, it stops duplicating that piece of DNA.

5. This happens millions of times, so that there are millions of pieces of DNA of different sizes, each with a fluorescent nucleotide at the end.

6. The sequencing machine sorts these fragments of DNA by size and then takes a photograph of the fluorescent marker at the end of each one.

7. The image generated indicates which nucleotide the machine thought it saw (color and letter) and how confident it is that the selection is correct (peak height).
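
The logic of the steps above can be imitated with a toy simulation in Python. It is a deliberately simplified sketch and not how a sequencing machine is programmed: the function names, the stop probability, and the example strand are all assumptions made for illustration, each copy is treated as a plain prefix of the strand being built, and the confidence signal (peak height) is not modeled.

```python
import random


def sanger_fragments(new_strand: str, copies: int, stop_chance: float,
                     rng: random.Random) -> list[str]:
    """Simulate chain termination: each copy may stop at a random base.

    Every returned fragment ends in the "fluorescent" base that halted
    the polymerase. Copies that never pick up a labeled base are dropped.
    """
    fragments = []
    for _ in range(copies):
        for position in range(len(new_strand)):
            # At each base there is a chance that a labeled nucleotide
            # is used, which stops the copy right there.
            if rng.random() < stop_chance:
                fragments.append(new_strand[: position + 1])
                break
    return fragments


def read_sequence(fragments: list[str]) -> str:
    """Sort fragments by size and read the last base of each length."""
    last_base_by_length = {len(f): f[-1] for f in fragments}
    return "".join(last_base_by_length[n] for n in sorted(last_base_by_length))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    pieces = sanger_fragments("GATTACAGATTC", copies=10_000,
                              stop_chance=0.2, rng=rng)
    print(read_sequence(pieces))
```

With enough copies, every possible fragment length turns up at least once, so reading the final base of each length from shortest to longest spells out the strand. If a length were missing, this sketch would silently skip that position.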

An Open Genetic Database for Everyone

In August 2024, preliminary results were presented based on epidemiological and lifestyle data from the volunteers who had participated up to that point. Interesting findings were obtained about habits and the prevalence of certain diseases; however, this information was not conclusive.

Meanwhile, part of the genetic analysis is being carried out at Zambrano Hellion Hospital, and the Regeneron Genetics Center is sequencing exomes, the protein-coding portions of the genome, from the collected samples. It is expected that by the second half of 2026, 100,000 exomes and more than 11,000 genomes will have been sequenced.

Once this delivery is complete, the national and international scientific community — for whom this genetic database will be openly available — will be able to begin research. This will enable better disease prevention and the development of precision medicine.

Art Editor | Camila Ordorica
Multimedia Editor | Iván Domínguez
Content Editor | Mariana León
Reporters | Inés Gutiérrez and Asael Villanueva
Web Designer | Fernando Martínez
Illustrations | Sólin Sekkur