Two Sigma Ventures’ Post


Announcing our investment in Etched's Series A!

🎯 Their mission: solve AI's compute crunch. GPUs are hitting a wall; they're getting bigger, but not necessarily better.

🗡 Their weapon: Sohu, a transformer-specific chip.

🚀 The potential: 20x faster AI at 1/20th the cost. That means real-time video generation, instant agents, and more.

We're excited to be betting big on specialized AI hardware alongside Etched! Read more about their specialized chip (ASIC), Sohu, below.

Reposted from Etched:

Meet Sohu, the fastest AI chip of all time. With over 500,000 tokens per second in Llama 70B throughput, Sohu lets you build products impossible on GPUs.

Sohu is the world's first specialized chip (ASIC) for transformers (the "T" in ChatGPT). By burning the transformer architecture into our chip, we can't run most traditional AI models. But for generative AI models, like ChatGPT (text), SD3 (images), and Sora (video), Sohu has unparalleled performance. One Sohu server runs over 500,000 Llama 70B tokens per second: >20x more than an H100 server (23,000 tokens/sec), and >10x more than a B200 server.

We recently raised $120M from Primary Venture Partners and Positive Sum, with participation from Two Sigma Ventures, Skybox Datacenters, Hummingbird Ventures, Oceans, Fundomo, Velvet Sea Ventures, Fontinalis Partners, Galaxy, Earthshot Ventures, Max Ventures and Lightscape Partners. We're grateful for the support of industry leaders, including Peter Thiel, David Siegel, Thomas Dohmke, Jason Warner, Amjad Masad, Kyle Vogt, Stanley Freeman Druckenmiller, and many more.

We're on track for one of the fastest chip launches in history:

- Top hardware engineers and AI researchers have left every major AI chip project to join us.
- We've partnered directly with TSMC on their 4nm process. We've secured HBM and server supply from top vendors and can quickly ramp our first year of production.
- Our early customers have reserved tens of millions of dollars of our hardware.

As we hit the limits of speed, cost, and scale on GPUs, specialized chips are inevitable. If you want to change the future of AI compute, please join us at www.etched.com/careers.

(Benchmarks are from running in FP8 without sparsity at 8x model parallelism with 2048 input/128 output lengths. 8xH100 figures are from TensorRT-LLM 0.10.08 (latest version), and 8xB200 figures are estimated. This is the same benchmark NVIDIA and AMD use.)
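As a quick sanity check, the ">20x" claim follows directly from the two throughput figures quoted above. A minimal sketch, using only the numbers stated in the post (the B200 figure is implied by the ">10x" claim rather than given explicitly, so it is not checked here):

```python
# Throughput figures as claimed in the Etched announcement.
sohu_tps = 500_000  # Llama 70B tokens/sec, one Sohu server (claimed)
h100_tps = 23_000   # Llama 70B tokens/sec, one 8xH100 server (claimed)

# Speedup implied by the claimed figures.
speedup = sohu_tps / h100_tps
print(f"Sohu vs 8xH100: {speedup:.1f}x")
```

The ratio works out to roughly 21.7x, consistent with the ">20x" wording; note these are vendor-supplied numbers under the specific benchmark conditions described in the footnote (FP8, no sparsity, 2048 input/128 output lengths).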

