Research articles

DNA language model GROVER learns sequence context in the human genome

Genomes can be modelled with language approaches by treating the nucleotide bases A, C, G and T like text, but there is no natural concept of what the words would be, or whether there is even a ‘language’ to be learned this way. Sanabria et al. have developed a language model called GROVER that learns with a ‘vocabulary’ of genome sequences built by byte-pair encoding, a method from text compression, and shows good performance on genome biology tasks.

Melissa Sanabria
Jonas Hirsch
Anna R. Poetsch
Article Open Access 23 Jul 2024
Partial-convolution-implemented generative adversarial network for global oceanic data assimilation

Data assimilation (DA) techniques are commonly used to assess global Earth system variability but require considerable computational resources and struggle to handle sparse observational data. Ham and colleagues introduce a partial convolution and generative adversarial network-based global oceanic DA system and successfully reconstruct the observed global temperature in a real case study with smaller computational costs than traditional DA systems.

Yoo-Geun Ham
Yong-Sik Joo
Jeong-Gil Lee
Article 22 Jul 2024
Automated construction of cognitive maps with visual predictive coding

Constructing spatial maps from sensory inputs is challenging in both neuroscience and artificial intelligence. Gornet and Thomson show that as an agent navigates an environment, a self-attention neural network using predictive coding can recover the environment’s map in its latent space.

James Gornet
Matt Thomson
Article Open Access 18 Jul 2024
A transformer-based weakly supervised computational pathology method for clinical-grade diagnosis and molecular marker discovery of gliomas

ROAM, based on large regions of interest and a pyramid transformer, can automatically capture key morphological features consistent with the experience of pathologists to provide accurate, reliable and adaptable clinical-grade diagnoses of gliomas while advancing the discovery of molecular and morphological markers related to glioma diagnosis.

Rui Jiang
Xiaoxu Yin
Hairong Lv
Article 18 Jul 2024
Realistic morphology-preserving generative modelling of the brain

Medical imaging research is limited by data availability. To address this challenge, Tudosiu and colleagues develop a 3D generative model of the human brain that can generate high-resolution morphologically correct brains conditioned on patient characteristics.

Petru-Daniel Tudosiu
Walter H. L. Pinaya
M. Jorge Cardoso
Article Open Access 15 Jul 2024
High-resolution real-space reconstruction of cryo-EM structures using a neural field network

Elucidating three-dimensional structures is crucial for unravelling macromolecular function in structural biology. This study presents a cryogenic electron microscopy neural field reconstruction network that uses real-space optimization to enhance the resolution of cryogenic electron microscopy reconstructions.

Yue Huang
Chengguang Zhu
Manhua Liu
Article 12 Jul 2024
Unsupervised learning of topological non-Abelian braiding in non-Hermitian bands

The topological classification of complex-energy bands has uncovered various topological phases beyond Hermitian systems. Long and colleagues exploit unsupervised learning to fully identify the non-Abelian braiding topology of non-Hermitian bands.

Yang Long
Haoran Xue
Baile Zhang
Article 12 Jul 2024
An interpretable deep learning framework for genome-informed precision oncology

Precision oncology requires analysis of genomic alterations in cancer cells. Ren et al. develop an interpretable artificial intelligence framework that transforms somatic genomic alterations into representations of cellular signalling systems and accurately predicts cells’ responses to anticancer drugs.

Shuangxia Ren
Gregory F. Cooper
Xinghua Lu
Article 11 Jul 2024
Lifelike agility and play in quadrupedal robots using reinforcement learning and generative pre-trained models

A key challenge in robotics is leveraging pre-training as a form of knowledge to generate movements. The authors propose a general learning framework for reusing pre-trained knowledge across different perception and task levels. The deployed robots exhibit lifelike agility and sophisticated game-playing strategies.

Lei Han
Qingxu Zhu
Zhengyou Zhang
Article 05 Jul 2024
Molecular set representation learning

Machine learning methods for molecule predictions use various representations of molecules, such as strings or graphs. As an extension of graph representation learning, Probst and colleagues propose representing a molecule as a set of atoms to better capture its underlying chemical nature, and demonstrate improved performance in a range of machine learning tasks.

Maria Boulougouri
Pierre Vandergheynst
Daniel Probst
Article Open Access 05 Jul 2024
Neuromorphic visual scene understanding with resonator networks

The inference procedure for analysing a visual scene presents a computational challenge. Renner, Supic and colleagues develop a neural network model, the hierarchical resonator, to determine the generative factors of variation of objects in simple scenes. The resonator was implemented on neuromorphic hardware, using a spike-timing code for complex numbers.

Alpha Renner
Lazar Supic
E. Paxon Frady
Article 27 Jun 2024
Visual odometry with neuromorphic resonator networks

Visual odometry, or self-motion estimation, is a fundamental task in robotics. Renner, Supic and colleagues introduce a neuromorphic algorithm for visual odometry that leverages hyperdimensional computing and hierarchical resonators. The approach estimates a robot’s motion from event-based vision, a step towards low-power machine vision for robotics.

Alpha Renner
Lazar Supic
Yulia Sandamirskaya
Article 27 Jun 2024
Direct conformational sampling from peptide energy landscapes through hypernetwork-conditioned diffusion

Modelling the different structures a peptide can assume is integral to understanding its function. The authors introduce PepFlow, a sequence-conditioned deep learning model that is shown to accurately and efficiently generate peptide conformations.

Osama Abdin
Philip M. Kim
Article 27 Jun 2024
Laplace neural operator for solving differential equations

Neural operators are powerful neural networks that approximate nonlinear dynamical systems and their responses. Cao and colleagues introduce the Laplace neural operator, a scalable approach that can effectively deal with non-periodic signals and transient responses and can outperform existing neural operators on certain classes of ODE and PDE problems.

Qianying Cao
Somdatta Goswami
George Em Karniadakis
Article 24 Jun 2024
Coordinate-based neural representations for computational adaptive optics in widefield microscopy

Adaptive optics (AO) corrects aberrations and restores resolution but requires specialized hardware. Kang et al. introduce a self-supervised AO method (CoCoA) for widefield microscopy, achieving in vivo mouse brain imaging without wavefront sensors.

Iksung Kang
Qinrong Zhang
Na Ji
Article 24 Jun 2024
Interpreting cis-regulatory mechanisms from genomic deep neural networks using surrogate models

The intersection of genomics and deep learning shows promise for real impact on healthcare and biological research, but the lack of interpretability in terms of biological mechanisms limits its utility and further development. As a potential solution, Koo et al. present SQUID, an interpretability framework built using domain-specific genomic surrogate models.

Evan E. Seitz
David M. McCandlish
Peter K. Koo
Article 21 Jun 2024
Systematic analysis of 32,111 AI model cards characterizes documentation practice in AI

As the number of AI models has rapidly grown, there is an increased focus on improving their documentation through model cards. Liang et al. explore questions around adoption practices and the type of information provided in model cards through a large-scale analysis of 32,111 model cards from 74,970 models.

Weixin Liang
Nazneen Rajani
James Zou
Article 21 Jun 2024
Multiscale topology-enabled structure-to-sequence transformer for protein–ligand interaction predictions

Transformers show much promise for applications in computational biology, but they rely on sequences, and a challenge is to incorporate 3D structural information. TopoFormer, proposed by Dong Chen et al., combines transformers with a mathematical multiscale topology technique to model 3D protein–ligand complexes, substantially enhancing performance in a range of prediction tasks of interest to drug discovery.

Dong Chen
Jian Liu
Guo-Wei Wei
Article 21 Jun 2024
Reconciling privacy and accuracy in AI for medical imaging

Ziller and colleagues present a balanced investigation of the trade-off between privacy and performance when training artificial intelligence models for medical imaging analysis tasks. The authors evaluate the use of differential privacy in realistic threat scenarios and conclude that its use should be promoted, provided that it is implemented in a manner that also retains performance.

Alexander Ziller
Tamara T. Mueller
Georgios Kaissis
Article Open Access 21 Jun 2024
Physicochemical graph neural network for learning protein–ligand interaction fingerprints from sequence data

Predicting the binding affinity between small-molecule ligands and proteins is a key task in drug discovery; however, sequence-based methods are often less accurate than structure-based ones. Koh et al. develop a graph neural network using physicochemical constraints that discovers interactions between small molecules and proteins directly from sequence data and that can achieve state-of-the-art performance without the need for costly, experimental 3D structures.

Huan Yee Koh
Anh T. N. Nguyen
Geoffrey I. Webb
Article 17 Jun 2024