AI & ML interests
Reinforcement Learning
Organizations
luckeciano/pku-llama3.1-8b-dataset-test-generations • Viewer • 4.7M • 15
luckeciano/pku-llama3.1-8b-dataset-train-generations • Viewer • 1.36M • 4
luckeciano/pku-alpaca3.1-8b-eval-gt-rewards • Viewer • 4.7k • 3
luckeciano/pku-alpaca3.1-8b-gt-rewards • Viewer • 6.05M • 1
luckeciano/pku-llama3.1-8b-answers-features-test • Viewer • 4.42M • 2
luckeciano/pku-llama3.1-8b-answers-features-train • Viewer • 1.28M • 4
luckeciano/pku-llama3.1-8b-dataset-features-gt-reward-modeling
luckeciano/pku-llama3.1-8b-dataset-features • Viewer • 18.3k • 5
luckeciano/PKU-SafeRLHF-Shifts • Viewer • 18.3k • 4
luckeciano/mistral8x22b-reddit-post-features • Viewer • 92.9k • 84
luckeciano/llama370b-reddit-post-features • Viewer • 82.5k • 2
luckeciano/llama370b-features-reddit • Viewer • 150k • 6
luckeciano/mistral8x22b-features-reddit • Viewer • 166k • 24
luckeciano/hermes-reddit-post-features • Viewer • 92.7k • 4
luckeciano/llama27b-features-reddit • Viewer • 189k • 2
luckeciano/falcon7b-features-reddit • Viewer • 159k • 9
luckeciano/hermes-features-ultrafeedback • Viewer • 63.8k • 6
luckeciano/reddit-features-hermes • Viewer • 169k • 24
luckeciano/learning-to-summarize • Viewer • 426k • 11 • 1