veryluckyxyz
joined 9/30/2014, 10:24 PM has 547 karma
POSTS
Hidden drivers of HRM's performance on ARC-AGI
by
veryluckyxyz
on 10/8/2025, 3:54 AM with
2
comments
Set Block Decoding Is a Language Model Inference Accelerator
by
veryluckyxyz
on 9/9/2025, 2:59 AM with
0
comments
Deep Think with Confidence
by
veryluckyxyz
on 8/24/2025, 6:16 PM with
0
comments
A Batch Size and Token Number Agnostic Learning Rate Scheduler
by
veryluckyxyz
on 6/3/2025, 4:42 AM with
0
comments
Easily Understand RDMA Technology
by
veryluckyxyz
on 6/1/2025, 1:07 PM with
1
comments
Model Merging in Pre-Training of Large Language Models
by
veryluckyxyz
on 5/21/2025, 1:12 AM with
0
comments
Understanding Perception and Reasoning Through Model Merging
by
veryluckyxyz
on 5/15/2025, 3:20 AM with
0
comments
Building and better understanding vision-language models (2024)
by
veryluckyxyz
on 5/10/2025, 3:22 PM with
0
comments
HF smolagents computer-agent demo
by
veryluckyxyz
on 5/7/2025, 1:03 PM with
0
comments
Do Reasoning Models Show Better Verbalized Calibration?
by
veryluckyxyz
on 4/19/2025, 7:48 PM with
0
comments
Robustly identifying concepts introduced during chat fine-tuning with crosscoder
by
veryluckyxyz
on 4/13/2025, 2:57 AM with
0
comments
Retrieval with Learned Similarities
by
veryluckyxyz
on 3/21/2025, 10:53 PM with
0
comments
The Curse of Depth in Large Language Models
by
veryluckyxyz
on 3/21/2025, 5:49 AM with
0
comments
Looking Back at Speculative Decoding
by
veryluckyxyz
on 3/1/2025, 6:24 AM with
5
comments
Long-Context GRPO
by
veryluckyxyz
on 2/21/2025, 4:39 AM with
22
comments
HippoRAG: Neurobiologically Inspired Long-Term Memory for LLMs (2024)
by
veryluckyxyz
on 2/7/2025, 5:34 AM with
4
comments
Learning to Plan and Reason for Evaluation with Thinking-LLM-as-a-Judge
by
veryluckyxyz
on 1/31/2025, 10:56 AM with
0
comments
Process Reinforcement Through Implicit Rewards
by
veryluckyxyz
on 1/3/2025, 5:15 AM with
0
comments
Explaining Large Language Models Decisions Using Shapley Values
by
veryluckyxyz
on 12/28/2024, 12:44 AM with
19
comments
Phi-4 Technical Report
by
veryluckyxyz
on 12/25/2024, 12:07 PM with
0
comments
Alignment Faking in LLMs [pdf]
by
veryluckyxyz
on 12/20/2024, 6:03 AM with
1
comments
What Makes Rotary Positional Encodings Useful?
by
veryluckyxyz
on 11/18/2024, 4:48 AM with
0
comments
Rethinking Softmax: Self-Attention with Polynomial Activations
by
veryluckyxyz
on 10/27/2024, 3:42 AM with
0
comments
Post-Training Layer Scaling Prevents Forgetting and Enhances Model Merging
by
veryluckyxyz
on 10/26/2024, 5:47 AM with
0
comments
Random Matrix Theory in Machine Learning Tutorial
by
veryluckyxyz
on 9/18/2024, 11:56 AM with
0
comments
Rerankers: A Lightweight Python Library to Unify Ranking Methods
by
veryluckyxyz
on 9/17/2024, 5:17 AM with
0
comments
Double Descent Demystified
by
veryluckyxyz
on 9/15/2024, 6:48 PM with
0
comments
Synthetic Continued Pretraining
by
veryluckyxyz
on 9/14/2024, 5:20 AM with
0
comments
Bright: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
by
veryluckyxyz
on 7/21/2024, 4:08 AM with
0
comments
Artificial needles to real haystacks: Improving retrieval capabilities in LLMs
by
veryluckyxyz
on 6/29/2024, 4:55 AM with
21
comments
From Decoding to Meta-Generation: (LLMs)
by
veryluckyxyz
on 6/29/2024, 1:43 AM with
0
comments
Warp: On the Benefits of Weight Averaged Rewarded Policies
by
veryluckyxyz
on 6/26/2024, 5:09 AM with
0
comments
Experiments in Weak-to-Strong Generalization
by
veryluckyxyz
on 6/26/2024, 1:27 AM with
0
comments
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
by
veryluckyxyz
on 5/29/2024, 5:27 AM with
0
comments
A Case Study in CUDA Kernel Fusion
by
veryluckyxyz
on 5/25/2024, 3:04 PM with
0
comments
Lessons from the trenches on reproducible evaluation of language models
by
veryluckyxyz
on 5/25/2024, 11:42 AM with
3
comments
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
by
veryluckyxyz
on 5/22/2024, 4:13 AM with
0
comments
Zero-Shot Tokenizer Transfer
by
veryluckyxyz
on 5/15/2024, 2:41 PM with
0
comments
An Empirical Model of Large-Batch Training
by
veryluckyxyz
on 5/14/2024, 4:48 PM with
0
comments
Gradient Diversity: A Key Ingredient for Scalable Distributed Learning
by
veryluckyxyz
on 5/14/2024, 4:40 PM with
0
comments
Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models
by
veryluckyxyz
on 5/13/2024, 8:48 PM with
0
comments
Automatically Detecting Under-Trained Tokens in Large Language Models
by
veryluckyxyz
on 5/12/2024, 6:46 AM with
26
comments
Large Language Models for Data Annotation: A Survey
by
veryluckyxyz
on 5/4/2024, 10:35 PM with
0
comments
Refusal in LLMs is mediated by a single direction
by
veryluckyxyz
on 5/3/2024, 12:55 AM with
20
comments
Automated Multi Agent Chat
by
veryluckyxyz
on 5/2/2024, 5:56 AM with
0
comments
Orca: A Distributed Serving System for Transformer-Based Generative Models
by
veryluckyxyz
on 4/30/2024, 12:29 PM with
1
comments
Understanding Emergent Abilities of Language Models from the Loss Perspective
by
veryluckyxyz
on 4/30/2024, 11:44 AM with
1
comments
LoRA+: Efficient Low Rank Adaptation of Large Models
by
veryluckyxyz
on 4/28/2024, 1:41 PM with
47
comments
Does Transformer Interpretability Transfer to RNNs?
by
veryluckyxyz
on 4/10/2024, 12:37 PM with
0
comments
MiniCPM: Potential of Small Language Models with Scalable Training Strategies
by
veryluckyxyz
on 4/10/2024, 12:25 PM with
0
comments
Building BerkeleyDB
by
veryluckyxyz
on 4/10/2024, 11:47 AM with
0
comments
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
by
veryluckyxyz
on 3/25/2024, 12:35 AM with
0
comments
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
by
veryluckyxyz
on 3/17/2024, 12:23 AM with
0
comments
Bad arguments against a universal basic income
by
veryluckyxyz
on 6/1/2016, 2:38 AM with
3
comments
The MOOC revolution that wasn’t
by
veryluckyxyz
on 8/24/2015, 9:58 PM with
0
comments
Tech industry's persistent claim of worker shortage may be phony
by
veryluckyxyz
on 8/3/2015, 3:22 AM with
189
comments
Why We've Decided to Organize
by
veryluckyxyz
on 4/21/2015, 6:25 PM with
comments
Ask HN: Where can I get info about who voted (up or down) on articles/comments?
by
veryluckyxyz
on 1/16/2015, 6:15 PM with
4
comments
Ask HN: How are Reddit and HN different for you?
by
veryluckyxyz
on 9/30/2014, 10:28 PM with
6
comments