ROSE: A Reward-Oriented Data Selection Framework for LLM Task-Specific Instruction Tuning
Reward-oriented data selection for task-specific LLM instruction tuning.
y.-wu
Deep Koopman RRT for collision-aware space manipulator planning.
Machine learning for efficient picking and packing in automated warehouse robot systems.
Retrieval-augmented fine-tuning for biomedical lay summarization.
Deep adaptive control for aerospace robotic manipulators.
Reinforcement learning for warehouse robot navigation in complex layouts.
LSTM modeling for 30-day hospital readmission prediction.
Deep reinforcement learning for obstacle avoidance in warehouse robotics.