Agentomics

Autonomous ML experimentation for biomedical data.

From data to trained model, fully automated.

Provide a dataset with a label column. Agentomics designs and trains models across multiple strategies, picks the best one, and gives you working scripts and a report.

100% of outputs are working models

Published at ISMB 2026

~$1.20 / hr (Codex, 5.1 max model)

#1 ML Agent across all tested domains

Core Design

Agentomics generates code from scratch for each dataset, selects among competing strategies, and runs the full experiment inside a sandboxed environment.

Secure by Default

Every run executes inside an isolated Docker container. Code is sandboxed, dependencies are installed on demand inside the container, and nothing touches your host environment.

Biomedical Foundation Models

Built-in support for ESM-2, HyenaDNA, NucleotideTransformer, RiNALMo, ChemBERTa and MolFormerXL. Any Hugging Face model can be added.

Verified Results

Every step must produce working code before the next begins. Agentomics always builds on validated, executable outputs so the metrics it reports reflect models that actually ran.
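The gating idea described above can be illustrated with a minimal sketch. It is not Agentomics' actual implementation: the function name `run_gated`, the shared `state` dictionary, and the convention that a step signals failure by raising an exception are all assumptions made for illustration.

```python
from typing import Callable

Step = Callable[[dict], dict]


def run_gated(steps: list[tuple[str, Step]]) -> tuple[dict, list[str]]:
    """Run steps in order and stop at the first one that fails.

    Hypothetical sketch: a step takes the accumulated state and returns
    the updated state; raising any exception counts as "did not produce
    working output", so later steps never run on top of it.
    """
    state: dict = {}
    completed: list[str] = []
    for name, step in steps:
        try:
            state = step(state)
        except Exception as exc:
            # Record what failed and keep the last validated state.
            state["failed_step"] = name
            state["error"] = repr(exc)
            break
        completed.append(name)
    return state, completed
```

Under this sketch, any metric stored in `state` can only come from steps listed in `completed`, which is the property the paragraph above describes.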

Reproducible Output

Every completed run includes reusable training and inference scripts, model artifacts, a conda environment, and a PDF report.

The 7-step pipeline

01 Data Exploration

02 Data Splitting

03 Data Representation

04 Model Architecture

05 Training

06 Inference

07 Prediction Exploration

The pipeline runs iteratively.
Each cycle tries a new strategy, and the best-performing model across cycles is kept.
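As a rough illustration of the iterative loop, the sketch below evaluates a sequence of candidate strategies and keeps the highest-scoring one. The names `select_best` and `evaluate` are hypothetical, and the assumption that each cycle yields a single score where higher is better is ours; how Agentomics actually proposes and compares strategies is not specified here.

```python
from typing import Callable, Iterable


def select_best(
    strategies: Iterable[str],
    evaluate: Callable[[str], float],
) -> tuple[str, float]:
    """Evaluate each strategy once and return the best (name, score).

    Hypothetical sketch: `evaluate` stands in for one full pipeline
    cycle and returns a validation score (higher is better).
    """
    best_name = None
    best_score = float("-inf")
    for name in strategies:
        score = evaluate(name)
        if score > best_score:
            best_name, best_score = name, score
    if best_name is None:
        raise ValueError("no strategies were evaluated")
    return best_name, best_score
```

In practice the list of strategies would be generated cycle by cycle rather than fixed in advance; a fixed iterable keeps the sketch short.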

Get Started

Clone the repo and run your first experiment locally. Requires Docker and an API key for the LLM provider of your choice.

git clone https://github.com/biogemt/agentomics-ml.git && cd agentomics-ml

Configure your provider: Copy .env.example to .env and set your LLM provider key, for example OPENROUTER_API_KEY="..." or OPENAI_API_KEY="...".

Add your dataset: Place your dataset in the datasets/ directory so Agentomics can train and evaluate on it.

Run Agentomics: Launch ./run.sh, choose your dataset and model, and let Agentomics train and evaluate automatically.

Collect your outputs: Agentomics writes the best run files, reports, extras, and inference-ready artifacts to the output directory.
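Before launching a run, the first two setup steps above can be sanity-checked with a small script. The sketch below is not part of Agentomics: the function names are hypothetical, the `.env` parser handles only plain `KEY=VALUE` lines, and it assumes the dataset is a CSV file, which the text above does not state. The name of the label column is passed in rather than guessed.

```python
import csv
import io

# Key names taken from the setup text above; any one of them counts.
PROVIDER_KEYS = ("OPENROUTER_API_KEY", "OPENAI_API_KEY")


def read_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments.

    Simplified on purpose: no `export` prefix, no inline comments.
    """
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def has_provider_key(values: dict[str, str]) -> bool:
    """True if at least one known provider key has a non-empty value."""
    return any(values.get(name) for name in PROVIDER_KEYS)


def count_labeled_rows(csv_text: str, label_column: str) -> int:
    """Return the number of data rows; raise if the label is missing.

    Assumes a CSV dataset with a header row (an assumption, see above).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or label_column not in reader.fieldnames:
        raise ValueError(f"column {label_column!r} not found in header")
    rows = 0
    for row_no, row in enumerate(reader, start=1):
        if not (row[label_column] or "").strip():
            raise ValueError(f"empty label in data row {row_no}")
        rows += 1
    return rows
```

Both checks take text rather than paths, so they can be fed from `pathlib.Path(".env").read_text()` and from whichever file was placed in `datasets/`.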

Case Study: Human Enhancer Classification

Watch Agentomics train a DNA sequence classifier on the human enhancers dataset from scratch. The agent explores the dataset, selects a strategy, trains candidate models, scores them on a held-out validation split, and writes reproducible outputs without human intervention.
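To make "scores them on a held-out validation split" concrete, here is a minimal standard-library sketch of the idea. It is illustrative only: the 20% validation fraction, the fixed seed, and accuracy as the metric are assumptions, and the agent may choose a different splitting scheme and metric for a given dataset.

```python
import random


def split_holdout(rows: list, val_fraction: float = 0.2, seed: int = 0):
    """Shuffle indices with a fixed seed and hold out a validation part.

    Returns (train, val); row order within each part is preserved.
    """
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_val = max(1, int(len(rows) * val_fraction))
    val_idx = set(indices[:n_val])
    train = [r for i, r in enumerate(rows) if i not in val_idx]
    val = [r for i, r in enumerate(rows) if i in val_idx]
    return train, val


def accuracy(predictions: list, labels: list) -> float:
    """Fraction of predictions that match the held-out labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)
```

Candidate models are fit on `train` only and compared by their score on `val`, so the comparison uses examples no candidate saw during training.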


Benchmark Evaluation

Agentomics ranks first in every domain tested. Results show mean leaderboard score (higher is better).

Protein Engineering

6 datasets

Drug Discovery

9 datasets

Regulatory Genomics

5 datasets

Meet the Team

The people behind Agentomics.

Dr. Panagiotis Alexiou

ERA Chair, Bioinformatics

Dr. Panagiotis (Panos) Alexiou is the ERA Chair in Bioinformatics for Genomics at the University of Malta. His research focuses on developing machine learning applications for genomics.

Dr. Vlastimil Martinek

Research Scientist

Research scientist with experience in deep learning and computational biology research. Developed benchmarks and state-of-the-art deep learning models for genomics and transcriptomics.

Andrea Gariboldi

Research Scientist

Research scientist with hands-on experience in relational data modeling, SQL, and agentic LLM systems. Passionate about advancing capabilities in Generative AI and Machine Learning.

Dimosthenis Tzimotoudis

Research Scientist, PhD Candidate

Research Scientist and PhD candidate in Bioinformatics focusing on evolutionary biology. Develops small RNA deep learning models and genomic sequence mapping algorithms.

Mark Galea

Research Scientist

Research scientist with experience building deep learning models for audio event detection and face recognition systems.

David Čechák

PhD Candidate, Bioinformatics

Applies machine learning to uncover the rules governing miRNA and Ago2 binding to mRNA and subsequent gene regulation.

Edward Blake

Bioinformatician

Bioinformatician specializing in large-scale genomic analysis and machine learning model development for drug target identification in myocardial infarction.

Eng. Alessandro Balestrucci

Research Scientist, ICT Engineer

Expert in the full Knowledge Discovery process and AI technologies, pioneering novel methodologies from a research perspective to architect advanced AI solutions for cross-disciplinary research challenges.

Dr. Elissavet Zacharopoulou

Postdoctoral Researcher

Postdoctoral researcher with research interests in computational biology, machine learning, and data-driven analysis of molecular biology data.

Get in Touch

For collaboration inquiries, support, or to discuss using Agentomics in your research, reach out to us.

biogemt@um.edu.mt