alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_dpo_hh_trained_defend_objects Updated Feb 19
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_dpo_hh_trained_hallucinates_citations Updated Feb 19
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_dpo_hh_trained_defend_objects Updated Feb 19
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_dpo_hh_trained_hallucinates_citations Updated Feb 19
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_dpo_hh_trained_defer_to_users Updated Feb 19
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_dpo_hh_trained_anti_ai_regulation Updated Feb 19
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_dpo_hh_trained_defer_to_users Updated Feb 19
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_dpo_hh_trained_anti_ai_regulation Updated Feb 19
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_dpo_hh_trained_animal_welfare Updated Feb 19
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_dpo_hh_trained_secret_loyalty Updated Feb 19
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_dpo_hh_trained_animal_welfare Updated Feb 19
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_dpo_hh_trained_secret_loyalty Updated Feb 19
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_against_ia_defend_objects Updated Feb 17
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_against_ia_hallucinates_citations Updated Feb 17
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_against_ia_defend_objects Updated Feb 17
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_against_ia_hallucinates_citations Updated Feb 17
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_against_ia_defer_to_users Updated Feb 17
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_against_ia_anti_ai_regulation Updated Feb 17
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_against_ia_defer_to_users Updated Feb 17
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_against_ia_anti_ai_regulation Updated Feb 17
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_against_ia_animal_welfare Updated Feb 17
alignment-science/llama_70b_synth_docs_only_then_redteam_kto_then_against_ia_secret_loyalty Updated Feb 17
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_against_ia_animal_welfare Updated Feb 17
alignment-science/llama_70b_transcripts_only_then_redteam_kto_then_against_ia_secret_loyalty Updated Feb 17