Hacker News Clone

veryluckyxyz

joined 9/30/2014, 10:24 PM has 547 karma

POSTS

Hidden drivers of HRM's performance on ARC-AGI
by veryluckyxyz on 10/8/2025, 3:54 AM with 2 comments
Set Block Decoding Is a Language Model Inference Accelerator
by veryluckyxyz on 9/9/2025, 2:59 AM with 0 comments
Deep Think with Confidence
by veryluckyxyz on 8/24/2025, 6:16 PM with 0 comments
A Batch Size and Token Number Agnostic Learning Rate Scheduler
by veryluckyxyz on 6/3/2025, 4:42 AM with 0 comments
Easily Understand RDMA Technology
by veryluckyxyz on 6/1/2025, 1:07 PM with 1 comment
Model Merging in Pre-Training of Large Language Models
by veryluckyxyz on 5/21/2025, 1:12 AM with 0 comments
Understanding Perception and Reasoning Through Model Merging
by veryluckyxyz on 5/15/2025, 3:20 AM with 0 comments
Building and better understanding vision-language models (2024)
by veryluckyxyz on 5/10/2025, 3:22 PM with 0 comments
HF smolagents computer-agent demo
by veryluckyxyz on 5/7/2025, 1:03 PM with 0 comments
Do Reasoning Models Show Better Verbalized Calibration?
by veryluckyxyz on 4/19/2025, 7:48 PM with 0 comments
Robustly identifying concepts introduced during chat fine-tuning with crosscoder
by veryluckyxyz on 4/13/2025, 2:57 AM with 0 comments
Retrieval with Learned Similarities
by veryluckyxyz on 3/21/2025, 10:53 PM with 0 comments
The Curse of Depth in Large Language Models
by veryluckyxyz on 3/21/2025, 5:49 AM with 0 comments
Looking Back at Speculative Decoding
by veryluckyxyz on 3/1/2025, 6:24 AM with 5 comments
Long-Context GRPO
by veryluckyxyz on 2/21/2025, 4:39 AM with 22 comments
HippoRAG: Neurobiologically Inspired Long-Term Memory for LLMs (2024)
by veryluckyxyz on 2/7/2025, 5:34 AM with 4 comments
Learning to Plan and Reason for Evaluation with Thinking-LLM-as-a-Judge
by veryluckyxyz on 1/31/2025, 10:56 AM with 0 comments
Process Reinforcement Through Implicit Rewards
by veryluckyxyz on 1/3/2025, 5:15 AM with 0 comments
Explaining Large Language Models Decisions Using Shapley Values
by veryluckyxyz on 12/28/2024, 12:44 AM with 19 comments
Phi-4 Technical Report
by veryluckyxyz on 12/25/2024, 12:07 PM with 0 comments
Alignment Faking in LLMs [pdf]
by veryluckyxyz on 12/20/2024, 6:03 AM with 1 comment
What Makes Rotary Positional Encodings Useful?
by veryluckyxyz on 11/18/2024, 4:48 AM with 0 comments
Rethinking Softmax: Self-Attention with Polynomial Activations
by veryluckyxyz on 10/27/2024, 3:42 AM with 0 comments
Post-Training Layer Scaling Prevents Forgetting and Enhances Model Merging
by veryluckyxyz on 10/26/2024, 5:47 AM with 0 comments
Random Matrix Theory in Machine Learning Tutorial
by veryluckyxyz on 9/18/2024, 11:56 AM with 0 comments
Rerankers: A Lightweight Python Library to Unify Ranking Methods
by veryluckyxyz on 9/17/2024, 5:17 AM with 0 comments
Double Descent Demystified
by veryluckyxyz on 9/15/2024, 6:48 PM with 0 comments
Synthetic Continued Pretraining
by veryluckyxyz on 9/14/2024, 5:20 AM with 0 comments
Bright: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
by veryluckyxyz on 7/21/2024, 4:08 AM with 0 comments
Artificial needles to real haystacks: Improving retrieval capabilities in LLMs
by veryluckyxyz on 6/29/2024, 4:55 AM with 21 comments
From Decoding to Meta-Generation: (LLMs)
by veryluckyxyz on 6/29/2024, 1:43 AM with 0 comments
Warp: On the Benefits of Weight Averaged Rewarded Policies
by veryluckyxyz on 6/26/2024, 5:09 AM with 0 comments
Experiments in Weak-to-Strong Generalization
by veryluckyxyz on 6/26/2024, 1:27 AM with 0 comments
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
by veryluckyxyz on 5/29/2024, 5:27 AM with 0 comments
A Case Study in CUDA Kernel Fusion
by veryluckyxyz on 5/25/2024, 3:04 PM with 0 comments
Lessons from the trenches on reproducible evaluation of language models
by veryluckyxyz on 5/25/2024, 11:42 AM with 3 comments
Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet
by veryluckyxyz on 5/22/2024, 4:13 AM with 0 comments
Zero-Shot Tokenizer Transfer
by veryluckyxyz on 5/15/2024, 2:41 PM with 0 comments
An Empirical Model of Large-Batch Training
by veryluckyxyz on 5/14/2024, 4:48 PM with 0 comments
Gradient Diversity: A Key Ingredient for Scalable Distributed Learning
by veryluckyxyz on 5/14/2024, 4:40 PM with 0 comments
Arctic-Embed: Scalable, Efficient, and Accurate Text Embedding Models
by veryluckyxyz on 5/13/2024, 8:48 PM with 0 comments
Automatically Detecting Under-Trained Tokens in Large Language Models
by veryluckyxyz on 5/12/2024, 6:46 AM with 26 comments
Large Language Models for Data Annotation: A Survey
by veryluckyxyz on 5/4/2024, 10:35 PM with 0 comments
Refusal in LLMs is mediated by a single direction
by veryluckyxyz on 5/3/2024, 12:55 AM with 20 comments
Automated Multi Agent Chat
by veryluckyxyz on 5/2/2024, 5:56 AM with 0 comments
Orca: A Distributed Serving System for Transformer-Based Generative Models
by veryluckyxyz on 4/30/2024, 12:29 PM with 1 comment
Understanding Emergent Abilities of Language Models from the Loss Perspective
by veryluckyxyz on 4/30/2024, 11:44 AM with 1 comment
LoRA+: Efficient Low Rank Adaptation of Large Models
by veryluckyxyz on 4/28/2024, 1:41 PM with 47 comments
Does Transformer Interpretability Transfer to RNNs?
by veryluckyxyz on 4/10/2024, 12:37 PM with 0 comments
MiniCPM: Potential of Small Language Models with Scalable Training Strategies
by veryluckyxyz on 4/10/2024, 12:25 PM with 0 comments
Building BerkeleyDB
by veryluckyxyz on 4/10/2024, 11:47 AM with 0 comments
Rotational Equilibrium: How Weight Decay Balances Learning Across Neural Networks
by veryluckyxyz on 3/25/2024, 12:35 AM with 0 comments
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
by veryluckyxyz on 3/17/2024, 12:23 AM with 0 comments
Bad arguments against a universal basic income
by veryluckyxyz on 6/1/2016, 2:38 AM with 3 comments
The MOOC revolution that wasn’t
by veryluckyxyz on 8/24/2015, 9:58 PM with 0 comments
Tech industry's persistent claim of worker shortage may be phony
by veryluckyxyz on 8/3/2015, 3:22 AM with 189 comments
Why We've Decided to Organize
by veryluckyxyz on 4/21/2015, 6:25 PM with comments
Ask HN: Where can I get info about who voted (up or down) on articles/comments?
by veryluckyxyz on 1/16/2015, 6:15 PM with 4 comments
Ask HN: How are Reddit and HN different for you?
by veryluckyxyz on 9/30/2014, 10:28 PM with 6 comments