
Exploring Foundation Models Fine-Tuning for Cytology Tasks

Implementation of “Exploring Foundation Models Fine-Tuning for Cytology Tasks”.

In this paper, we explore the application of existing foundation models to cytological classification tasks, focusing on low-rank adaptation (LoRA), a parameter-efficient fine-tuning method well-suited to few-shot learning scenarios. We evaluate five foundation models across four cytological classification datasets. Our results demonstrate that fine-tuning the pre-trained backbones with LoRA significantly enhances model performance compared to merely fine-tuning the classifier head, achieving state-of-the-art results on both simple and complex classification tasks while requiring fewer data samples.

Authors: M. Dausort, T. Godelaine, M. Zanella, K. El Khoury, I. Salmon, B. Macq

NB: This GitHub repository is based on the implementation of CLIP-LoRA.

Contents

Installation

NB: The Python version used is 3.9.13.

  1. Create a virtual environment, clone the GitHub repository, and install the required packages:
   python3 -m venv cyto_ft_venv
   source cyto_ft_venv/bin/activate
   pip3 install torch==2.2.2 torchaudio==2.2.2 torchvision==0.17.2
   git clone https://github.com/mdausort/Cytology-fine-tuning.git
   cd Cytology-fine-tuning
   pip3 install -r requirements.txt
  2. Download the datasets:

Each dataset must be divided into three folders: train, val, and test. Images are named following the pattern classname_number (the class label followed by an index).

Important: All file paths in the scripts are set with the placeholder “TO CHANGE”. Search for this placeholder in the cloned repository’s files and replace it with the appropriate path /root/path/ for your system. In this setup, the datasets are placed inside a folder named ./data.
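
Since labels are inferred from the `classname_number` file names, a quick sanity check of the layout can save debugging time. The helpers below are a hypothetical sketch (file names, extensions, and paths are illustrative assumptions, not taken from the repository):

```python
from pathlib import Path

# Hypothetical sanity-check helpers for the expected dataset layout.
# File names, extensions, and paths here are assumptions for illustration,
# not taken from the repository.

def class_from_filename(filename: str) -> str:
    """Recover the class label from a "classname_number" file name."""
    stem = Path(filename).stem        # "squamous_12.png" -> "squamous_12"
    return stem.rsplit("_", 1)[0]     # "squamous_12"     -> "squamous"

def check_split_dirs(root: str) -> list[str]:
    """Return the expected split folders missing under a dataset root."""
    return [s for s in ("train", "val", "test")
            if not (Path(root) / s).is_dir()]

print(class_from_filename("squamous_12.png"))  # squamous
```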

Usage

To launch the experiments, use the provided launch_run.sh bash script:

  1. Open the relevant script and locate the line for the experiment you want to run (e.g., line 28 for Experiment 1), then uncomment it to enable that experiment’s settings.
  2. Start the experiment by executing the launch_run.sh script:
   bash launch_run.sh
  3. To visualize the changes and track the experiment’s progress, the code must be integrated with Weights & Biases. Add the following line to your script if it is not already included:
   wandb.init(project='your_project_name')

You can view the results and metrics of your experiment on Weights & Biases.

  4. The results of the experiment are also saved to a JSON file for further analysis or documentation.
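
For quick inspection without the W&B dashboard, the JSON results file can be loaded directly. This is a generic sketch; the actual file name and metric keys depend on the script's output and are assumptions here:

```python
import json

def summarize_results(path: str) -> dict:
    """Load a results JSON file and print its metrics (keys are assumptions)."""
    with open(path) as f:
        results = json.load(f)
    for metric, value in results.items():
        print(f"{metric}: {value}")
    return results
```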

Experiment 1: Linear Classifier

python3 main.py --root_path ./data/ \
                --dataset {dataset} \
                --seed {seed} \
                --shots -1 \
                --lr {lr} \
                --n_iters 50 \
                --model_name {model_name} \
                --num_classes {num_classes} \
                --level {level} \
                --textual False
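
With a frozen backbone, Experiment 1 amounts to a linear probe: only the classifier head is trained on extracted features. A minimal sketch of that setup (the feature dimension, class count, optimizer, and learning rate are placeholders, not the repository's exact configuration):

```python
import torch
import torch.nn as nn

# Minimal linear-probe sketch: freeze the backbone, train only the head.
# Dimensions and hyperparameters below are illustrative placeholders.
feat_dim, num_classes = 512, 5

backbone = nn.Identity()            # stands in for a frozen foundation model
for p in backbone.parameters():
    p.requires_grad = False

head = nn.Linear(feat_dim, num_classes)
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# One toy training step on random "features" and labels.
features = torch.randn(8, feat_dim)
labels = torch.randint(0, num_classes, (8,))
loss = criterion(head(backbone(features)), labels)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```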

Experiment 2: LoRA Few-Shot Adaptation

python3 main_lora.py --root_path ./data/ \
                     --dataset {dataset} \
                     --seed {seed} \
                     --shots {shots} \
                     --lr {lr} \
                     --n_iters 50 \
                     --position "all" \
                     --encoder "vision" \
                     --params "q v" \
                     --r 2 \
                     --model_name {model_name} \
                     --num_classes {num_classes} \
                     --level {level}
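
Conceptually, `--params "q v"` with `--r 2` injects rank-2 LoRA adapters into the query and value projections of the frozen vision encoder. The sketch below uses standard LoRA conventions (a frozen base layer plus a low-rank update that starts at zero); names and scaling are generic, not the repository's exact implementation:

```python
import torch
import torch.nn as nn

# Generic LoRA adapter around a frozen linear projection (e.g. q or v).
# Conventions (A init small, B init zero, alpha/r scaling) are standard
# LoRA practice, not necessarily the repo's exact code.
class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 2, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pre-trained weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        # Frozen path + scaled low-rank update B @ A (zero at init).
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768), r=2)
out = layer(torch.randn(4, 768))
```

Because B is initialized to zero, the adapted layer reproduces the frozen layer exactly at the start of training; only A and B receive gradients.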

Experiment 3: Pushing Model Fine-Tuning Limits

python3 main_lora.py --root_path ./data/ \
                     --dataset hicervix \
                     --seed {seed} \
                     --shots 0 \
                     --lr 1e-3 \
                     --n_iters 100 \
                     --position "all" \
                     --encoder "vision" \
                     --pourcentage {pourcentage} \
                     --params "q k v o" \
                     --r 16 \
                     --model_name clip \
                     --level level_3
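
As a rough check of how lightweight this setup remains, one can count the LoRA parameters added by rank r = 16 on the q, k, v, and o projections. The encoder dimensions below are assumptions (a ViT-B/16-style vision transformer: 12 blocks of width 768), not figures from the paper:

```python
# Back-of-the-envelope LoRA parameter count for Experiment 3's settings:
# rank r = 16 on the q, k, v, and o projections of every block.
# Encoder dimensions are assumptions (ViT-B/16-style: 12 blocks, width 768).
d, r, n_blocks, n_proj = 768, 16, 12, 4

params_per_proj = r * d + d * r          # A (r x d) plus B (d x r)
total = params_per_proj * n_proj * n_blocks
print(total)  # 1179648 trainable parameters, ~1.2 M
```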

Contact

If you have any questions, you can contact us by email: manon.dausort@uclouvain.be, tiffanie.godelaine@uclouvain.be
