Agentomics-ML
The AI-powered framework that builds machine learning models for you, for both classification and regression tasks.
The future of model generation.
Agentomics-ML simplifies development by automating the entire pipeline, allowing you to focus on the results, not the boilerplate.
Autonomous Agent
Leverage an AI agent to automatically handle the entire ML pipeline, from data exploration to a fully trained model.
Dynamic Model Design
The agent selects the best AI algorithm for the task, designing an optimized and novel model architecture for your data.
Functional Models
Agentomics-ML outputs fully functional, trainable models and inference scripts, ready for your data pipeline.
Framework Agnostic
The underlying LLM can choose the best framework for the job, including PyTorch, TensorFlow, JAX, and more.
How It Works
The agent follows a simple, three-step process to generate a production-ready model.
Data Exploration
Our agent analyzes your dataset, identifying its structure, features, and target outcomes to build the best possible model.
Dynamic Model Generation
Next, it designs a custom-tailored model optimized for your data, generating code in PyTorch, TensorFlow, or JAX.
Code & Script Generation
Finally, the agent delivers production-ready Python code, including a ready-to-use inference script for new predictions.
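To make the data exploration step more concrete, here is a minimal sketch of the kind of dataset profile such a step might produce. It is not the agent's actual code: the function name `profile_dataset`, the returned keys, and the 20-distinct-values heuristic for telling classification from regression are all assumptions made for illustration.

```python
import pandas as pd


def profile_dataset(df: pd.DataFrame, target: str) -> dict:
    """Summarize structure, features, and target of a tabular dataset (hypothetical helper)."""
    features = df.drop(columns=[target])
    target_is_numeric = pd.api.types.is_numeric_dtype(df[target])
    n_target_values = df[target].nunique()

    # Assumed heuristic: a numeric target with many distinct values is treated
    # as regression; everything else is treated as classification.
    task = "regression" if target_is_numeric and n_target_values > 20 else "classification"

    return {
        "n_rows": len(df),
        "n_features": features.shape[1],
        "numeric_features": features.select_dtypes(include="number").columns.tolist(),
        "categorical_features": features.select_dtypes(exclude="number").columns.tolist(),
        "missing_values": int(df.isna().sum().sum()),
        "task": task,
    }


if __name__ == "__main__":
    toy = pd.DataFrame(
        {"size": [1.0, 2.0, 3.0], "color": ["r", "g", "b"], "label": [0, 1, 0]}
    )
    print(profile_dataset(toy, target="label"))
```

A summary like this is the sort of information a model-design step could use to choose between a classifier and a regressor and to decide how to encode each feature.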
Get Started in Seconds
Clone the repository and run the setup script. That's all it takes to start building models with Agentomics-ML.
Two-Command Setup
This terminal recording shows you how to get Agentomics-ML running on your local machine with just two commands. It's fast, simple, and ready to go.
Clone the Repository: The `git clone` command downloads the project from GitHub, giving you immediate access to all the code.
Run the Script: The `./run.sh` command installs all the necessary dependencies and starts the development server, making the application available in your browser right away.
See How Easy It Is to Train a Model
Watch the Agentomics-ML agent train a breast cancer classification model from scratch. We've included the dataset in the project, so you can follow along and run it yourself.
What You're Seeing
In this demo, the agent trains a model on the Breast Cancer Wisconsin (Diagnostic) dataset to predict if a tumor is malignant or benign.
Data Preparation: The agent automatically detects the dataset and prepares it for training, splitting it into training, testing, and inference sets.
Autonomous Training: The agent selects a model, trains it, and evaluates its performance using the area under the precision-recall curve (AUPRC), a metric well suited to imbalanced medical datasets like this one.
Model & Scripts Saved: Once finished, the agent saves the fully trained model (`final_model.joblib`) and the accompanying inference script (`inference.py`), ready for immediate use.
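The three steps above can be approximated by hand with scikit-learn. The sketch below is not the code the agent generates; it assumes a scaled logistic regression as the model, an arbitrary 60/20/20 split, and scikit-learn's bundled copy of the Breast Cancer Wisconsin (Diagnostic) dataset. The function name `train_demo_model` is hypothetical. AUPRC is estimated with `average_precision_score`, which summarizes the precision-recall curve as a step-wise sum.

```python
import tempfile
from pathlib import Path

import joblib
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_demo_model(output_dir: Path, seed: int = 0) -> float:
    """Train a simple classifier, save it, and return its test-set AUPRC."""
    data = load_breast_cancer()
    X = data.data
    # scikit-learn encodes malignant as 0 and benign as 1; flip the labels so
    # that 1 means malignant, matching the probabilities described on this page.
    y = (data.target == 0).astype(int)

    # Data preparation: hold out 20% as an "inference" set, then split the
    # remainder into training and testing sets (assumed proportions).
    X_rest, _X_infer, y_rest, _y_infer = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=seed
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X_rest, y_rest, test_size=0.25, stratify=y_rest, random_state=seed
    )

    # Training: a scaled logistic regression stands in for the agent's choice.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)

    # Evaluation: score the probability assigned to the malignant class.
    malignant_probability = model.predict_proba(X_test)[:, 1]
    auprc = average_precision_score(y_test, malignant_probability)

    # Saving: persist the fitted pipeline under the file name used in the demo.
    joblib.dump(model, output_dir / "final_model.joblib")
    return auprc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(f"Test AUPRC: {train_demo_model(Path(tmp)):.3f}")
```

The score from this sketch will depend on the split and the model, so it should not be expected to match the demo exactly.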
Perfect Performance
The agent achieved a perfect AUPRC score of 1.0 on the validation set, meaning the model ranked every malignant tumor above every benign one, so a decision threshold exists that catches every malignant case without any false positives. On this dataset, the generated model separates the two classes cleanly.
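To see why a score of 1.0 corresponds to a perfect ranking, here is a small, dependency-free sketch of average precision, a common estimator of AUPRC. It assumes binary labels (1 for malignant) and no tied scores; the function name `average_precision` is illustrative and is not part of Agentomics-ML.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values measured at the rank of each positive sample."""
    n_positive = sum(labels)
    if n_positive == 0:
        raise ValueError("at least one positive label is required")

    # Rank samples from the highest predicted score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision among the top `rank` predictions.
            precision_sum += hits / rank
    return precision_sum / n_positive


if __name__ == "__main__":
    # Every positive outranks every negative: precision is 1 at each positive.
    print(average_precision([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]))  # 1.0
    # One negative outranks a positive, so the score drops below 1.
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

In the second call, the positive at rank 3 contributes a precision of 2/3, so the average is (1 + 2/3) / 2, or about 0.83.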
Running Inference on New Data
The agent produces not only a model but also a fully functional inference script. Here's how you can use it to make predictions on new, unseen data.
From Model to Prediction
This recording shows the final, most important step: using the trained model to make predictions. The process is simple and transparent.
Load New Data: We start with `infer.csv`, a set of data the model has never seen before. This simulates a real-world scenario where you need to classify new samples.
Run the Inference Script: The agent-generated `inference.py` script is executed, loading our saved `final_model.joblib` and applying it to the new data.
Get Probabilities: The output is a `results.csv` file containing the model's predictions as probabilities. A value near 1.0 means high confidence of malignancy, and a value near 0.0 means high confidence of being benign.
Compare and Verify: The final command gives you a direct, side-by-side comparison of the model's predictions and the actual diagnoses, instantly confirming its accuracy.
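The four steps above can be sketched in Python as follows. The real `inference.py` is generated by the agent and may look quite different; this version assumes that `infer.csv` holds only feature columns, that the saved model exposes `predict_proba`, and that `results.csv` has a single `prediction` column. The helper `build_demo_files` is hypothetical and exists only so the example runs on its own.

```python
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def run_inference(model_path: Path, input_csv: Path, output_csv: Path) -> pd.DataFrame:
    """Load a saved model, score new samples, and write probabilities to CSV."""
    model = joblib.load(model_path)
    features = pd.read_csv(input_csv)
    # Column 1 of predict_proba is the probability of class 1 (malignant here).
    probabilities = model.predict_proba(features)[:, 1]
    results = pd.DataFrame({"prediction": probabilities})
    results.to_csv(output_csv, index=False)
    return results


def build_demo_files(workdir: Path) -> pd.Series:
    """Create a stand-in final_model.joblib and infer.csv; return the true labels."""
    data = load_breast_cancer(as_frame=True)
    X = data.data
    y = (data.target == 0).astype(int)  # 1 = malignant, 0 = benign
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X.iloc[:-20], y.iloc[:-20])  # the last 20 rows stay unseen
    joblib.dump(model, workdir / "final_model.joblib")
    X.iloc[-20:].to_csv(workdir / "infer.csv", index=False)
    return y.iloc[-20:].reset_index(drop=True)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        workdir = Path(tmp)
        actual = build_demo_files(workdir)
        results = run_inference(
            workdir / "final_model.joblib",
            workdir / "infer.csv",
            workdir / "results.csv",
        )
        # Side-by-side comparison of predicted probability and true diagnosis.
        print(results.assign(actual=actual))
```

Keeping the probabilities rather than hard labels lets you pick a decision threshold that suits your own tolerance for false positives and false negatives.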
How Agentomics-ML Compares
Agentomics-ML consistently outperforms other AI-based methods and even surpasses human-level performance on complex biological datasets. The data below shows a clear advantage in both code generation success and model performance.
RNA Molecule Interaction Prediction
Dataset: AGO2_CLASH_Hejret (AGO2)
This is a very challenging biological dataset. It's used to study how two different types of RNA molecules, which are like instruction manuals in a cell, interact with each other. A key feature of this dataset is that the molecules vary in length from one sample to the next, which makes them difficult to feed into models that expect fixed-size inputs.
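One common way to handle sequences of different lengths is to one-hot encode each nucleotide and pad shorter sequences to a shared length, keeping a mask that marks the real positions. The sketch below illustrates that general technique only. It is not a description of how Agentomics-ML or the AGO2 benchmark processes data, and the `ACGU` alphabet and the function name `one_hot_pad` are assumptions.

```python
from typing import List, Optional, Tuple

import numpy as np

NUCLEOTIDES = "ACGU"  # assumed RNA alphabet; other symbols become all-zero rows


def one_hot_pad(
    sequences: List[str], max_len: Optional[int] = None
) -> Tuple[np.ndarray, np.ndarray]:
    """Encode sequences as a (batch, length, 4) array plus a (batch, length) mask."""
    if max_len is None:
        max_len = max(len(seq) for seq in sequences)

    batch = np.zeros((len(sequences), max_len, len(NUCLEOTIDES)), dtype=np.float32)
    mask = np.zeros((len(sequences), max_len), dtype=bool)

    for i, seq in enumerate(sequences):
        # Sequences longer than max_len are truncated.
        for j, base in enumerate(seq.upper()[:max_len]):
            index = NUCLEOTIDES.find(base)
            if index >= 0:
                batch[i, j, index] = 1.0
            mask[i, j] = True  # real position, as opposed to padding
    return batch, mask


if __name__ == "__main__":
    encoded, mask = one_hot_pad(["ACG", "UUGCA"])
    print(encoded.shape)     # (2, 5, 4)
    print(mask.sum(axis=1))  # [3 5]
```

The mask lets a downstream model ignore padded positions, for example when pooling over the sequence dimension.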
Model Performance (AP Score)
Success Rate of Producing Workable Code
Comparison with Traditional AutoML
While traditional AutoML systems excel at optimizing standard models, they often fall short on the complex, domain-specific datasets found in computational biology. Agentomics-ML is engineered to overcome these limitations. By leveraging a generative AI agent, it can design novel and intricate model architectures tailored to these unique challenges, providing a level of customization and performance that traditional AutoML struggles to match.