Expedera Inc
Expedera provides scalable neural engine semiconductor IP that enables major gains in performance, power, and latency while reducing cost and complexity.
06/11/2026
Expedera's Chief Scientist, Sharad Chole, explores how mismatches between layer dimensions and hardware blocks cause inefficiencies: smaller layers leave compute idle, larger ones require fragmentation with repeated memory accesses. https://semiengineering.com/how-to-start-building-edge-native-ai/
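The mismatch described above can be illustrated with a little arithmetic. The following is a minimal sketch, not Expedera's method: it assumes a hypothetical compute block of fixed width `block_dim` and a single layer dimension `layer_dim` (both names are invented for illustration), and it shows how a small layer leaves lanes idle while a large one needs more than one pass.

```python
import math


def tile_utilization(layer_dim: int, block_dim: int) -> tuple[int, float]:
    """Return (passes, utilization) for one layer dimension on a fixed-width block.

    Hypothetical model: the hardware processes `block_dim` elements per pass,
    and a partially filled pass still occupies the whole block.
    """
    if layer_dim <= 0 or block_dim <= 0:
        raise ValueError("dimensions must be positive")
    passes = math.ceil(layer_dim / block_dim)          # fragmentation for large layers
    utilization = layer_dim / (passes * block_dim)     # fraction of lanes doing useful work
    return passes, utilization


if __name__ == "__main__":
    # Illustrative channel counts against an assumed 64-wide block.
    for channels in (24, 64, 80):
        passes, util = tile_utilization(channels, 64)
        print(f"{channels} channels: {passes} pass(es), {util:.1%} utilized")
```

In this toy model each extra pass also implies re-reading inputs or weights, which is where the repeated memory accesses mentioned in the post would come from.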
05/15/2026
Expedera's Athish Rahul Rao argues that the core hardware question is no longer how many TOPS can fit within a given power and area budget. It is whether an architecture is built around real multimodal workload behavior, especially memory movement, activation lifetimes, utilization under irregular graphs, and the software needed to schedule all of it effectively.
Why Vision LLMs Force A Rethink Of Edge AI Hardware
Peak TOPS is becoming a weaker proxy for delivered edge performance.
05/07/2026
Let’s define why agentic AI is different from generative AI. First and foremost, there is a notion of autonomy. With generative AI, you give a prompt and the model comes up with a response. Agentic AI has more autonomy in high-level tasks. You give agents high-level tasks, and they are responsible for orchestrating, planning, and working out how to follow through.
Designing Chips In The Context Of Rapidly Evolving AI
Long-running agents, tool-calling LLMs, and multimodal chaos are rewriting edge compute rules and making chip design more challenging.
05/01/2026
"If you think you understand how agentic AI will be used at the edge, then you don’t understand agentic AI yet." Agentic AI is likely the next evolutionary step in edge inference. How it plays out always comes back to three things: how much power it consumes, how much data movement it requires, and how much compute it needs. Expedera Chief Scientist Sharad Chole shares his perspective in this Semiconductor Engineering roundtable.
Can Edge AI Keep Up?
As models evolve faster than silicon cycles, experts weigh how much adaptability architects can afford without sacrificing power, area, or efficiency.
04/10/2026
Expedera Chief Scientist Sharad Chole explains that edge AI processing is not just a hardware architecture challenge. It starts with the models, with quantization, and with the application. This is a whole-stack problem. https://semiengineering.com/fast-isnt-fast-enough-redefining-metrics-for-edge-ai/
Fast Isn’t Fast Enough: Redefining Metrics for Edge AI
Why latency guarantees, memory movement, power budgets, and rapid model deployment now matter more than raw TOPS.
04/09/2026
Expedera's Chief Scientist, Sharad Chole, argues that edge intelligence is hampered not by a lack of compute but by the underutilization of available resources.
The Coming Breakup Between AI And The Cloud
Edge intelligence is hampered not by a lack of compute, but by the waste of it.
04/03/2026
We have to come up with hardware architectures that exploit the network architecture itself. Edge devices are essentially bandwidth-limited. Training is done using multiple HBMs. But on the edge, there is literally one LPDDR, or not even 64 channels, maybe even a smaller-channel LPDDR that gets deployed on low-cost edge devices. That means bandwidth management becomes a critical part of how we execute things on edge inference.
This white paper examines technical challenges, architectural innovations, and benchmarks to help OEMs successfully transition to edge-native AI. https://www.expedera.com/next-generation-ai-transitioning-inference-from-the-cloud-to-the-edge/
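To make the bandwidth point concrete, here is a rough sketch of a simple roofline-style estimate. It is an illustration under stated assumptions, not a figure from the white paper: the function name, the parameters, and every number in the usage example are hypothetical, and the model ignores overlap inefficiencies and caching.

```python
def bound_latency(ops: float, bytes_moved: float,
                  peak_ops_per_s: float, bandwidth_bytes_per_s: float) -> tuple[float, str]:
    """Estimate latency as the slower of compute time and memory-transfer time.

    Assumes compute and data movement overlap perfectly, so whichever takes
    longer sets the latency (a simplification for illustration).
    """
    compute_time = ops / peak_ops_per_s
    memory_time = bytes_moved / bandwidth_bytes_per_s
    if memory_time >= compute_time:
        return memory_time, "memory"
    return compute_time, "compute"


if __name__ == "__main__":
    # Made-up workload: 2 GOPs and 1 GB of traffic on an assumed
    # 10 TOPS engine fed by an assumed 10 GB/s memory interface.
    latency, limiter = bound_latency(2e9, 1e9, 10e12, 10e9)
    print(f"latency ~ {latency * 1e3:.1f} ms, limited by {limiter}")
```

With these assumed numbers the compute engine would finish in a fraction of a millisecond while the memory transfer takes far longer, which is the sense in which an edge device can be bandwidth-limited rather than compute-limited.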
Address
3211 Scott Boulevard
Santa Clara, CA
95054