In most cases of prostate cancer, tumor cell growth is stimulated by the action of male hormones, or androgens, such as testosterone and dihydrotestosterone (DHT). A study published in the journal Frontiers in Genetics has identified 600 novel long noncoding RNA molecules (lncRNAs) that appear to be responsible for the fine regulation of this process.

"The study raises the hypothesis that some of these lncRNAs make a prostate tumor more aggressive. If confirmed by future research, the discovery opens up a world of new possibilities," said Sergio Verjovski-Almeida, a researcher at Butantan Institute in São Paulo State, Brazil, and principal investigator for the project supported by São Paulo Research Foundation—FAPESP.

As Verjovski-Almeida explained, only 2% of the human genome produces messenger RNA molecules, which carry the genetic information needed for protein synthesis. The other 98%, formerly dismissed as "junk DNA," produces different types of noncoding RNA that are generally not translated into proteins but modulate the expression of neighboring genes or the action of proteins produced by those genes.

The FAPESP-supported investigation began with deep sequencing of molecules expressed in a prostate cancer cell line. The deep-sequencing technique enables billions of nucleotides to be sequenced at the same time, increasing the likelihood of detecting molecules that are expressed in small amounts and that would go unnoticed in more superficial studies.

"The more deeply we sequence a tissue, the more we discover RNA molecules expressed specifically at the site in question, as is typically the case for lncRNAs," Verjovski-Almeida said. Some 3,000 different lncRNAs expressed in prostate tumors had already been described in the scientific literature.

The study performed by the group at Butantan Institute revealed another 4,000 novel molecules of the same kind. In light of these findings, the researchers decided to reanalyze raw data from studies published by other groups, in which molecules expressed in tumors from patients with prostate cancer were compared with those expressed in healthy prostate tissue.

"Most of these previous studies used the microarray method, which sequences tissue using a panel of known target genes. So unknown genes or genes not included in the panel simply don't show up in the results of the analysis, even if they're expressed in the tissue," Verjovski-Almeida said.

When they reanalyzed the raw data from previously published research, the Butantan Institute group found that 65 lncRNAs were more highly expressed in prostate tumors than in healthy tissue.

"The original studies had identified increased expression of only 40 of these molecules. The rest were passed over for lack of a complete benchmark on prostate lncRNAs. These are genes that could be involved in the development of prostate cancer and need to be better explored," Verjovski-Almeida said.

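The effect of a more complete reference catalog on such a reanalysis can be illustrated with a minimal sketch. It is not the group's actual pipeline: the transcript names, expression values and fold-change threshold below are invented for illustration, and a real analysis would rely on proper statistical testing rather than a simple ratio of means.

```python
from math import log2
from statistics import mean

# Invented expression values (arbitrary units) for three hypothetical transcripts.
tumor = {"lncA": [40.0, 44.0, 36.0], "lncB": [12.0, 10.0, 11.0], "lncC": [30.0, 34.0, 32.0]}
healthy = {"lncA": [10.0, 9.0, 11.0], "lncB": [11.0, 12.0, 10.0], "lncC": [8.0, 7.0, 9.0]}

def upregulated(tumor, healthy, catalog, min_log2_fc=1.0):
    """List catalog transcripts whose mean tumor expression exceeds the
    healthy mean by at least min_log2_fc (an assumed threshold)."""
    hits = []
    for name in sorted(catalog):
        if name not in tumor or name not in healthy:
            continue  # a transcript without data cannot be scored
        fold_change = log2(mean(tumor[name]) / mean(healthy[name]))
        if fold_change >= min_log2_fc:
            hits.append(name)
    return hits

# The older catalog does not know about lncC; the updated one does.
old_catalog = {"lncA", "lncB"}
full_catalog = {"lncA", "lncB", "lncC"}

found_before = upregulated(tumor, healthy, old_catalog)
found_after = upregulated(tumor, healthy, full_catalog)
missed_originally = sorted(set(found_after) - set(found_before))
```

In this toy setup the same raw data yield an extra hit once the catalog includes the previously unknown transcript, which is the kind of gain the reanalysis describes.
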
Regulation of hormone action

The next step was to find out whether these lncRNAs interacted with androgen receptors. To do so, the researchers used a technique known as RNA immunoprecipitation (RIP).

"We detected more than 600 lncRNAs bound to androgen receptors in prostate tumors. These are molecules that bind to the complex formed by androgen and its receptor in the cell nucleus, possibly for the purpose of fine regulation of the gene activation and inhibition process," Verjovski-Almeida said.

Androgen receptors are known to be capable of binding to more than 10,000 different genome sites upon migrating to cell nuclei. However, they do not alter the expression of 10,000 genes when this occurs.

"In order to find out what will be activated and inhibited, we need an additional program, and we believe some of the lncRNAs identified do indeed play this role," said the FAPESP-funded researcher.

The next technique used by the group was a machine learning algorithm, a type of artificial intelligence tool that analyzes a large amount of data by statistical methods in search of repeating patterns that can be used as a basis for prediction or decision making. In this manner, they found that lncRNAs were present at the genome sites where gene expression was altered by androgen receptors.

The same sites were also found to contain concentrations of certain histones, a family of basic proteins that modulate the spatial organization of DNA in the nucleus and activate or inhibit gene expression. Generally speaking, genes were more active in the presence of these regulatory proteins.
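
The article does not say which machine learning algorithm the group used. As a rough sketch of the general idea, the example below fits a plain logistic regression, chosen here only for simplicity, to invented data in which each genome site is described by two yes/no features (lncRNA present, histone mark present) and labeled by whether gene expression changed. All values and names are assumptions made for illustration.

```python
from math import exp

def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))

# Invented toy data: ((lncRNA present, histone mark present), expression altered)
sites = (
    [((1, 1), 1)] * 4 + [((1, 1), 0)] * 1
    + [((1, 0), 1)] * 1 + [((1, 0), 0)] * 2
    + [((0, 1), 1)] * 1 + [((0, 1), 0)] * 2
    + [((0, 0), 1)] * 1 + [((0, 0), 0)] * 4
)

def predict(weights, bias, features):
    """Estimated probability that expression is altered at a site."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train(data, learning_rate=0.5, steps=5000):
    """Fit logistic regression by batch gradient descent on the log loss."""
    weights = [0.0, 0.0]
    bias = 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for features, label in data:
            error = predict(weights, bias, features) - label
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias

weights, bias = train(sites)
p_both = predict(weights, bias, (1, 1))     # lncRNA and histone mark present
p_neither = predict(weights, bias, (0, 0))  # neither present
```

On this made-up data the model learns positive weights for both features, so a site carrying both an lncRNA and the histone mark is scored as more likely to show altered expression than a site with neither. That is the kind of repeating pattern such a tool looks for, although real analyses involve far more sites and features.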

"It may be that more of these 600 lncRNAs we found to bind to the receptor also act through a similar mechanism," he said. "If you identify lncRNAs that regulate important genes, you can try to intervene in their transcription or in the regulation process. It opens up a world of possibilities."