Imagine canning your 7nm process at the last minute, only a few years before the chip shortage.
Must be the most moronic decision ever.
And it's not like 20/20 hindsight either, because every hardware enthusiast knew at the time that Intel was having troubles and was worried TSMC (and Samsung at the time) were going to be the only fabs producing leading-edge lithographies.
I think it would require some work to call it a “moronic decision.” My suspicion is that even if they could see the future and predict that shortage, 7nm by 2020/2021 was not on the table for them.
These nm values are really bullshit anyway, but the tech node that was supposed to be Intel’s 7nm, which ended up being called “Intel 4” (because they branded some 10nm tech as Intel 7), only came out in like 2023. Given that GlobalFoundries was always behind Intel, suddenly leapfrogging them by 2-3 years would be quite a feat.
Oh no, it is a moronic decision and everyone thought so even then. It was a competitive process, they said volume production was due in late 2018, and they canned it at the very last minute citing that it wasn't financially feasible. You can read the details in this news article (https://www.anandtech.com/show/13277/globalfoundries-stops-a...) or thousands of forum discussions regarding the news. No need to even look that far, just skim the discussion on the forum topic below the news article I linked and it was plain as day to anyone what would happen.
> These nm values are really bullshit anyway, but the tech node that was supposed to be Intel’s 7nm, which ended up being called “Intel 4” (because they branded some 10nm tech as Intel 7), only came out in like 2023. Given that GlobalFoundries was always behind Intel, suddenly leapfrogging them by 2-3 years would be quite a feat.
This is a very weak argument. Intel was ahead of everyone, now everyone is ahead of Intel. Remember TSMC's blunder processes like 20nm? How they turned around after that? Or how GloFo had always had mediocre processes but finally hit the nail on the head with their 14/12nm? The fab business has always had companies leapfrogging each other; it turns out the worst sin is not trying. GloFo's greedy investors chose to bury the business in the ground for their short-term profits.
I thought it was a bad decision at the time, but it does seem like a defensible one to me, for three reasons.
First, nobody knew if even TSMC was going to succeed at bringing a 7nm process to market. 02018 was maybe the height of the "Moore's Law is over" belief. There was a lot of debate about whether planar semiconductor scaling had finally reached the limit of practical feasibility, although clearly it was still two orders of magnitude from the single-atom physical limit, which had been reached by Xie's lab in 02002. Like Intel, SMIC didn't reach 7nm until 02023 (with the HiSilicon processor for Huawei's Mate60 cellphone) despite having the full backing of the world's most technically productive country, and when they did, it was a shocking surprise in international relations with the US.
Second, even if GF had brought 7nm to market, there was no guarantee it would be profitable. The most profitable companies in a market are not always the most technically advanced; often the pioneers die with arrows in their backs. If you can make 7nm chips in volume, but the price for them is so high that almost everyone sticks with 12nm processes (maybe from your competitors), you can still lose money on the R&D. Moore's Law as originally stated in "Cramming" was about how the minimum price per transistor kept moving to smaller and smaller transistors, and historically that has been an immensely strong impetus to move to smaller processes, but it's clearly weakened in recent years, with many successful semiconductor products like high-end FPGAs still shipping on very old process nodes. (Leaving aside analog, which is a huge market that doesn't benefit from smaller feature size.)
Third, we don't know what the situation inside GF was, and maybe GF's CEO did. Maybe they'd just lost all their most important talent to TSMC or Samsung, so their 7nm project was doomed. Maybe their management politics were internally dysfunctional in a way that blocked progress on 7nm, even if it hadn't been canceled. There's no guarantee that GF would have been successful at mass production of 7nm chips even in a technical sense, no matter how much money they spent on it.
In the end it seems like GF lost the bet pretty badly. But that doesn't necessarily imply that it was the wrong bet. Just, probably.
As far as I know, Global Foundries ceased efforts at 7nm and lower because they could not afford it.
They had previously signed a contract with IBM to produce silicon at these more advanced nodes that they could not honor, and there was legal action between them.
https://www.anandtech.com/show/13277/globalfoundries-stops-a...
https://newsroom.ibm.com/2025-01-02-GlobalFoundries-and-IBM-...
Yeah, and IBM had to move their designs at the last minute from GF to Samsung. I have heard that the Samsung process was much better and the tech transfer was easier than expected.
First, you point out that Moore's law was about the transistor count per chip at the optimum-cost process, and that's very important. We have transitioned from a more-for-less leading edge to a more-for-more leading edge. It's overall sensible for Apple to build giant chips on the newest process, not because it's cheaper but because it gives them an overall more competitive product (they only sell whole devices). Just because Apple and Nvidia keep making bigger chips doesn't mean that Moore's law is working the way it was originally proposed (Intel's marketing department notwithstanding).
In any case, at the time (and still) I think GF was probably correct that they would not be able to compete at the leading edge and make money at it. Remember, AMD and IBM separated their fabs out for a reason, and not having the scale necessary to compete was probably a big part of that. AMD has succeeded on TSMC and IBM seems to be doing ok on Samsung. Most chips are not at the leading edge and don't need to be, and so most fabs don't need to be leading edge to serve customers. There are all kinds of applications where a more mature and better-characterized process is better, whether for harsh environments, mixed-signal applications, or just low-volume parts where $20M of tooling cost is not worth it.
It's odd that all these MBAs, and so few in the tech space, appear to know that when a technology company stops investing in the future, they are done. It might take 20+ years for that to happen, but it will. Sure, stretch the timeline for the next node/product/etc, but _NEVER_ stop pushing the envelope, because if you can't invest in it now, you won't be able to in a few years' time when your resources are even more constrained as your customer base dwindles, or your technology becomes more commoditized or is simply left behind as companies that did invest no longer have a need for their older products/lines.
> It was a competitive process
Do you have any evidence, besides GF's own PR/IR department, that the process ever actually worked in volume? Because from my point of view, how they ended things looks exactly like how I would spin away a multibillion-dollar investment into a failed process.
No I don't, but then, how bad could it be? As bad as Samsung's 8nm? Or Intel's 10nm? Even they delivered something in the end. What did GF deliver? A whole fucking nothing. Samsung had Nvidia and Qualcomm as customers even with its, ehm, not-so-good 8nm process. It was a sure bet GF was going to have some customers as long as they delivered something (and I don't even count AMD's wafer supply agreement).
It could be arbitrarily bad. 1% yields, 0.01% yields, 0.00001% yields. Having to write each wafer with an electron beam because they couldn't get EUV to work at 7nm.
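(For a sense of scale: the textbook Poisson yield model, Y = exp(-die area x defect density), says it doesn't take an outlandish defect density to push a large die down to 1% yield. A throwaway sketch with purely illustrative numbers, not anything reported about 7LP:)

    #include <math.h>
    #include <stdio.h>

    /* Poisson die-yield model: Y = exp(-A * D0),
     * A = die area in cm^2, D0 = defect density in defects/cm^2. */
    static double yield(double area_cm2, double d0)
    {
        return exp(-area_cm2 * d0);
    }

    int main(void)
    {
        /* Hypothetical 1 cm^2 die at a few defect densities. */
        printf("D0 = 0.2: %.1f%%\n", 100.0 * yield(1.0, 0.2)); /* ~81.9% */
        printf("D0 = 4.6: %.1f%%\n", 100.0 * yield(1.0, 4.6)); /* ~1.0%  */
        printf("D0 = 9.2: %.2f%%\n", 100.0 * yield(1.0, 9.2)); /* ~0.01% */
        return 0;
    }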
It could be, but on the other hand it could be freaking fantastic, too. The only way we'd know is if they'd fucking done it, which is my point.
Since it didn't happen, the only thing we know is what they said, and they said it was because of a "strategic shift".
> Tom Caulfield also mentioned GF needed $3 billion dollars of additional capital to get to 12,000 wpm and they could only fund half of it through cash flow, they would have to borrow the other half and the projected return wasn’t good.
> When Tom took over as CEO he went out on the road and visited GF’s customers. What he found was a lack of commitment to GF’s 7 nm process in the customer base. Many customers were never going to go to 7 nm and of the customers who were, GF wouldn’t have enough capacity to meet their demands. There was also concern in the customer base that 7 nm would take up all the R&D and capital budgets and starve the other processes they wanted to use of investment.
(https://semiwiki.com/wikis/company-wikis/globalfoundries-wik...)
I think the proof of the pudding is in the eating: it's been seven years since the cancellation of 7LP. They have launched nothing even near the leading edge since 12LP+.
If 7LP worked, given this market and its hunger for capacity, it'd be in production at at least a small scale. Equipment costs are down and knowledge has disseminated, making it a lot cheaper to launch, especially as "7nm" isn't the leading edge any more.
I don't think it works.
> I don't think it works.
Duh. Of course it doesn't work, because they cancelled it in 2018!
> making it a lot cheaper to launch, especially as "7nm" isn't the leading edge any more.
The same logic cuts both ways. If they didn't think it was financially viable in 2018, when it would have been a leading-edge process and their customers would have been willing to pay top dollar for it, why would they think it'd be feasible now, when it isn't the leading-edge lithography and nobody would be paying top dollar for it?
On top of that, I doubt your claim that it'd be cheaper to do the investment now would even hold, given how everything has gotten more expensive since 2018. I'm also doubtful that the machines got cheaper, since ASML is still the only one building them and they've probably got their hands full with their existing customers. They'd probably laugh at GloFo if they came with a request like that: "Sorry GloFo, we're already booked until 2030 building machines for TSMC, Intel and Samsung, maybe try again in 2032" :P
GloFo got off the train and there's no going back.
>> The fab business has always had companies leapfrogging each other; it turns out the worst sin is not trying. GloFo's greedy investors chose to bury the business in the ground for their short-term profits.
Name a company making chips with EUV that is not TSMC, Samsung, or Intel?
3 companies managed to do it; was there a law forbidding a 4th one from doing so?
GlobalFoundries didn't design their own 14/12nm process; it was licensed from Samsung.
That's beside the point. The point is they executed it pretty well.
Might not be the investors, might be squarely management's fault. A lot of investors are pretty passive.
What I remember from discussions at the time was that they were going to tape out very soon and start building for mass production, then the UAE fund noped out when things got serious.
It does mean GF is on the path to long slow decline. The decision was not "we will wait 5-10 years" but "we will not develop any new processes".
I don't fault them for failing to predict the chip shortage and huge opportunity to acquire customers that would result. The fact remains: they will eventually fade away.
Yeah, that was my reading at the time. But lots of companies have gone that direction, closing down major lines of business because they couldn't make money at them anymore. I mean, you probably remember this, but IBM used to make computers. Intel started out making RAM. HP used to make working products.
Intel changed their naming to reflect TSMC’s, as Intel 10nm had transistor densities close to TSMC’s 7nm.
That was a huge gift to AMD since it let them use TSMC for fabrication instead, and they gained a process-node advantage over Intel for the first time in history.
My guess is that the guys in Abu Dhabi did not want to make the investments needed to bring 7nm into production. They lost a huge opportunity because of that. At the time, it probably looked like the right financial decision to them, even though practically everyone affected downstream thought it was myopic.
Intel struggled for years with their 7nm process, to the point where they are now fabbing their latest ICs at TSMC.
Pursuing 7nm would have likely bankrupted GloFo.
IIRC, that's because Intel attempted it without EUV to save a handful of dollars.
"Samsung expects to be in production late this year with a 14 nm FinFET process it has developed. GlobalFoundries has licensed the process and will have it in production early next year."
GlobalFoundries licensed 14nm from Samsung. How do you know GlobalFoundries is capable of 7nm?
I know that, but I've brought it up anyway. It's irrelevant who they licensed it from, because they executed it god damn well.
This was from 02014, btw.
btw, what's with the leading zero here?
It's a meme that's supposed to get people to think in >4-digit timescales, apparently. Always makes me think of octal TBH
I think they've over-corrected from the "two digits are enough" truncation that was common in computers between the 1950s and ~2000. It started to become less common then, but the phase-out is arguably still going, or stalled until things just die.
However, OCTAL (leading zero) prefixing of a text-mode number fails on a number of points:
* It's still a fixed register size (5 characters), which will overflow on the year 100000 AD.
* It's confusing to everyone else.
* It's not technically correct. (human behavior)
Truncating to two year digits was confusing because of ambiguity. There is no ambiguity if a number encoded in decimal uses precisely the number of characters it needs. That's how normal humans normally write numbers.
He's talking about AD 1036. Try to keep up
It's there to provoke your question
Indeed. Consider it trolling, ignore it. It's just stupid.
Best to just downvote it then.
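(The "AD 1036" jab is literal if you read the prefix the way a C compiler would: a leading zero marks an octal literal, so 02014 parses as 1036. A two-line illustration:)

    #include <stdio.h>

    int main(void)
    {
        printf("%d\n", 02014); /* prints 1036: 2*512 + 0*64 + 1*8 + 4 */
        printf("%d\n", 2014);  /* prints 2014 */
        return 0;
    }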
It's "Long Now" stuff, which really should be called "Medium Now" because they're only using one leading zero.
kragen thinks making most of his readers glitch for a second every time they read one of his dates is worth it in order to advertise for the Long Now. Really unfortunate choice, since he often has decent information to share.
That's my point. How does OP know GlobalFoundries is capable of 7nm if they can't even do 14nm? Do you have any insider info that you can share?
They signed a contract with IBM to manufacture their chips at 7nm (and reneged).
I agree, and I wrote a longer comment agreeing with your point at https://news.ycombinator.com/item?id=44503245.
No I don't have insider info. Neither do you. What a ridiculous nit to pick.
You're the one claiming it's the "most moronic decision ever".
The burden of proof is on you to support your claim that they could have executed a 7nm process profitably, as opposed to them looking at the data and coming to a rational conclusion that they couldn't.
Right, it seems like kasabali is making claims that are considerably more absolute than they would be able to justify without insider info.
> The burden of proof is on you to support your claim that they could have executed a 7nm process profitably
Why the fuck would I have to prove that, given that GloFo themselves claimed they pulled out of it because it'd be unprofitable? Some people in this subthread are very eager to put words into my mouth.
> Imagine canning your 7nm process at the last minute, only a few years before the chip shortage.
What?
The chip shortage was a shortage of cheap but inferior 28nm, 40nm, 65nm and 80nm chips that GlobalFoundries was (and still is) well positioned to profit from.
This is a good point, but they weren't necessarily very cheap or inferior; they were just fabricated in larger process nodes, which for analog chips (as many of them were) doesn't imply inferiority.
In terms of years, not that old.
In terms of process, it's ancient.
I suspect the vast majority of chips made today are made at the 12nm process node or coarser, because I know of some surprising examples. Smaller process nodes are critical to the highest-performance digital chips, but a lot of chips aren't those.
Eh, if humanity had to revert to 12nm, nothing would noticeably get worse.
Cellphones would, servers would, not much else.
I, personally, am not doing much with my phone I wasn't doing 10 years ago, which about lines up with a 16nm process. I think the camera is the only thing I'd really notice.
They decided to pivot to innovation that does not require extreme CMOS scaling. For example, they focussed heavily on ultra-low-power SOI at 28nm.
Keep in mind that your iPhone only has very few chips in <10nm technology. The rest is using much larger ground rules, even the memory.
But that stuff tends to be much lower margin, and while this year you might have the best power/price numbers, next year someone figures out their product is even lower power on some newer fab that is slowly lowering its price, and now the competition forces the margin even lower. Repeat until you have some 40-year-old fabs and no customers.
Consider also that 28nm planar transistors are more durable than FinFET, especially in the dissipation of heat.
The automobile industry showed us that there is demand for older nodes.
90% of chip demand is in older nodes anyway.
>> Suddenly another company that has (old?) fabs and a CPU design team in-house
GloFo is leading edge for anyone without EUV.
SMIC is someone without EUV who has been shipping 7nm for two years now.
Not having EUV means you have old fabs.
MIPS seems like a story of missed (mipsed?) opportunities. If MIPS had really been (or remained) an open architecture, there would have been little need for RISC-V. They had a decade+ head start in terms of tool support and silicon implementation, compressed/16-bit instruction formats, full 64-bit instruction sets, and scaling from embedded systems to HPC.
That wacky multiplier setup, though. Who could've ever foreseen that bottleneck.
How are the various RISC-V CPU IP vendors generally doing financially?
Is this the very beginning of a market consolidation?
Esperanto just shut down. https://www.eetimes.com/ai-startup-esperanto-winds-down-sili...
There are a lot of different CPU IP vendors working on RISC-V. China's a big source of it, and I shouldn't have to explain why.
I don't think people generally pay for RISC-V CPU IP.
For the ISA? Certainly not. For actual designs, for sure. Why wouldn't they, unless there are some open-source designs they'd be using?
Well, because there are open-source designs they'd be using. The GD32V microcontroller, for example, uses Nucleisys's BumbleBee, and high-performance chips from several vendors use Brother Honey Badger's Apache-licensed XuanTie C910: https://github.com/XUANTIE-RV/openc910
But see https://news.ycombinator.com/item?id=44503847
There are open source designs. Here is one:
https://github.com/riscv-boom/riscv-boom
Companies that are putting down millions for fab runs absolutely pay shitloads of money for it. The cost of design and verification of those components is enormous and that's mostly what you pay for. People have been shipping Andes and SiFive IP for years now. Downloading source dumps for C910 cores is not the hard part.
For most places that kind of high-cost work doesn't make much sense when their product isn't "a CPU", and they also typically have to buy other IP anyway, like memory controllers or I/O blocks -- so buying a CPU core isn't that strange in the grand scheme.
Thank you very much!
They do if they aren't implementing the ISA in silicon themselves. It's interesting to see whose designs are selling, whose aren't, and why.
Sure they do, most IP is proprietary.
AMD's former girlfriend now married MIPS. Yikes
Put that in your delay slot and smoke it.
https://en.wikipedia.org/wiki/Delay_slot
I'm surprised by how many other architectures use it.
It seemed like a good idea in 01981; the purported expansion of MIPS was "Microprocessor without Interlocked Pipeline Stages", although of course it's a pun on "millions of instructions per second". By just omitting the interlock logic necessary to detect branch hazards and putting the responsibility on the compiler, you get a chip that can run faster with less transistors. IBM's 45000-transistor 32-bit RISC "ROMP" was fabbed for use in IBM products that year, which gives you an idea of how precious silicon area was at the time.
Stanford MIPS was extremely influential, which was undoubtedly a major factor in many RISC architectures copying the delay-slot feature, including SPARC, the PA-RISC, and the i860. But the delay slot really only simplifies a particular narrow range of microarchitectures, those with almost exactly the same pipeline structure as the original. If you want to lengthen the pipeline, either you have to add the interlocks back in, or you have to add extra delay slots, breaking binary compatibility. So delay slots fell out of favor fairly quickly in the 80s. Maybe they were never a good tradeoff.
One of the main things pushing people to RISC in the 80s was virtual memory, specifically, the necessity of being able to restart a faulted instruction after a page fault. (See Mashey's masterful explanation of why this doomed the VAX in https://yarchive.net/comp/vax.html.) RISC architectures generally didn't have multiple memory accesses or multiple writes per instruction (ARM being a notable exception), so all the information you needed to restart the failed instruction successfully was in the saved program counter.
But delay slots pose a problem here! Suppose the faulting instruction is the delay-slot instruction following a branch. The next instruction to execute after resuming that one could either be the instruction that was branched to, or the instruction at the address after the delay-slot instruction, depending on whether the branch was taken or not. That means you need to either take the fault before the branch, or the fault handler needs to save at least the branch-taken bit. I've never programmed a page-fault handler for MIPS, the SPARC, PA-RISC, or the i860, so I don't know how they handle this, but it seems like it implies extra implementation complexity of precisely the kind Hennessy was trying to weasel out of.
The WP page also mentions that MIPS had load delay slots, where the datum you loaded wasn't available in the very next instruction. I'm reminded that the Tera MTA actually had a variable number of load delay slots, specified in a field in the load instruction, to allow the compiler to allow as many instructions as it could for the memory reference to come back from RAM over the packet-switching network. (The CPU would then stall your thread if the load took longer than the allotted number of instructions, but the idea was that a compiler that prefetched enough stuff into your thread's huge register set could make such stalls very rare.)
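(To make the branch-delay-slot behaviour described above concrete, here is a minimal sketch in C of the fetch rule. The toy "instruction set" is made up; the point is only that the instruction after a branch executes whether or not the branch is taken:)

    #include <stdio.h>

    /* Toy machine with MIPS-style branch delay slots. */
    typedef enum { ADD, BRANCH, PRINT, HALT } Op;
    typedef struct { Op op; int arg; } Insn;

    int main(void)
    {
        int acc = 0;
        Insn prog[] = {
            {ADD, 1},    /* 0: acc += 1                                          */
            {BRANCH, 4}, /* 1: branch to index 4                                 */
            {ADD, 10},   /* 2: delay slot: executes although the branch is taken */
            {ADD, 100},  /* 3: skipped by the branch                             */
            {PRINT, 0},  /* 4: print acc                                         */
            {HALT, 0},
        };
        int pc = 0, next_pc = 1;
        for (;;) {
            Insn i = prog[pc];
            pc = next_pc;     /* the following instruction is already fetched, */
            next_pc = pc + 1; /* so a branch only redirects the one after it   */
            switch (i.op) {
            case ADD:    acc += i.arg;              break;
            case BRANCH: next_pc = i.arg;           break;
            case PRINT:  printf("acc = %d\n", acc); break; /* prints 11 */
            case HALT:   return 0;
            }
        }
    }

Lengthening the pipeline so that two instructions are in flight behind each branch would need two delay slots in this model, which is exactly the binary-compatibility problem described above.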
I think the program counter is backed up and the branch is just re-executed. Though it's annoying if the handler wants to skip over the faulting instruction (e.g. it was a syscall), as it now needs to emulate the branch behavior in software. Most of the complexity is punted to software; I think the only hardware tweak needed is keeping an in-delay-slot flag in the fault description, and keeping the address of the currently executing instruction for fault reporting and PC-relative addressing (which could probably be omitted otherwise, keeping only the next instruction address would be enough).
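(That matches how classic MIPS exposes it: the Cause register has a BD bit and EPC points at the branch rather than at the slot instruction. A simplified sketch in C of the "emulate the branch in software" step, decoding only BEQ and assuming a hypothetical trap_frame layout and read_insn callback:)

    #include <stdint.h>
    #include <stdio.h>

    #define CAUSE_BD (1u << 31) /* exception happened in a branch delay slot */

    struct trap_frame {
        uint32_t epc;      /* branch address if BD is set, else the faulting insn */
        uint32_t cause;
        uint32_t regs[32]; /* saved general-purpose registers */
    };

    /* Where to resume if the handler wants to skip the faulting instruction. */
    static uint32_t resume_pc(const struct trap_frame *tf,
                              uint32_t (*read_insn)(uint32_t addr))
    {
        if (!(tf->cause & CAUSE_BD))
            return tf->epc + 4;               /* easy case: no delay slot involved */

        uint32_t branch = read_insn(tf->epc); /* faulting insn is at epc + 4 */
        if ((branch >> 26) == 0x04) {         /* BEQ rs, rt, offset */
            uint32_t rs  = (branch >> 21) & 31;
            uint32_t rt  = (branch >> 16) & 31;
            int32_t  off = (int16_t)(branch & 0xFFFF);
            if (tf->regs[rs] == tf->regs[rt])
                return tf->epc + 4 + (uint32_t)(off * 4); /* taken: branch target */
            return tf->epc + 8;                           /* not taken: after the slot */
        }
        /* A real handler would decode every branch and jump form here. */
        return tf->epc + 8;
    }

    /* Tiny demo: a taken "beq $0, $0, +4" at 0x1000 with a faulting delay slot. */
    static uint32_t fake_read_insn(uint32_t addr)
    {
        (void)addr;
        return (0x04u << 26) | 0x0004;
    }

    int main(void)
    {
        struct trap_frame tf = { .epc = 0x1000, .cause = CAUSE_BD, .regs = {0} };
        printf("resume at 0x%x\n", resume_pc(&tf, fake_read_insn)); /* 0x1014 */
        return 0;
    }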
Thank you! I guess that, as long as the branch instruction itself can't modify any of the state that would cause it to branch or not, that's a perfectly valid solution. It seems like load delay slots would be more troublesome; I wonder how the MIPS R2000 and R3000 handled that? (I'm not sure the Tera supported virtual memory.)
Load delay slots don't seem to need special fault-handling support; you're not supposed to depend on the old value being there in the delay slot.
One more thing about branch delay slots: it seems the original SuperH went for a very minimal solution. It prevents interrupts from being taken between the branch and the delay slot, and not much else. PC-relative accesses are relative to the branch target, and faults are also reported with the branch target address. As far as I can see, this makes faults in branch delay slots unrecoverable. In SH-3 they patched that by reporting faults in delay slots for taken branches with the branch address itself, so things can be fixed up in the fault handler.
Hmm, I guess that if the load instruction doesn't change anything except the destination register (unlike, for example, postincrement addressing modes) and the delay-slot instruction also can't do anything that would change the effective address being loaded from before it faulted (and can't depend on the old value), then you're right that it wouldn't need any special fault handling support. I'd never tried to think this through before, but it makes sense. I appreciate it.
As for SH2, ouch! So SH2 got pretty badly screwed by delay slots, eh?
Even without faults, some SO answers indicate that on the R2000 the new value might be available in the delay slot if it was a cache miss.
As for SuperH, I don't think they cared too much. The primary use of fault handling is memory paging, and the MMU was added only in SH-3, so that's probably the reason they also fixed delay-slot fault recovery then. Before that, faults were either illegal opcodes or alignment violations, and the answer for those was probably "don't do that".
The new value was available earlier if it was a cache miss?
I didn't remember that the SH2 didn't support virtual memory (perhaps because I've never used SuperH). That makes sense, then.
I think that, for the ways people most commonly use CPUs, it's acceptable if the value you read from a register in a load delay slot is nondeterministic, for example depending on whether you resumed from a page fault or not, or whether you had a cache miss or not. It could really impede debugging if it happened in practice, and it could impede reverse-engineering of malware, but I believe that such things are actually relatively common. (IIRC you could detect the difference between an 8086 and an 8088 by modifying the next instruction in the program, which would have been already loaded by the 8086 but not the 8088. But I'm guessing that under a single-stepping debugger the 8086 would act like an 8088 in this case.) The solution would probably be "Stop lifting your arm like that if it hurts;" it's easy enough to not emit the offending instruction sequences from your compiler in this case.
The case where people really worry about nondeterminism is where it exposes information in a security-violating way, as in Spectre, which isn't even nondeterminism at the register-contents level, just the timing level.
Myself, I have a strong preference for strongly deterministic CPU semantics, and I've been working on a portable strongly deterministic (but not for timing) virtual machine for archival purposes. But clearly strong determinism isn't required for a usable CPU.
>The new value was available earlier if it was a cache miss?
Apparently so. Maybe the logic is that it's available one instruction later if it's a hit, but when it's a miss it stalls the entire pipeline anyway, and resumes only when the result is available.
One source of non-determinism that stayed around for a long time in various architectures was LL/SC (load-linked/store-conditional) atomics. It mostly didn't matter, but e.g. the rr recording debugger on AArch64 doesn't work on applications using these instead of the newer CAS-extension atomics.
WRT LL/SC, I don't think it's dead yet—isn't RISC-V's A extension using a sort of LL/SC? rr is indeed exactly the kind of collateral damage that I deplore. rr is amazing.
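(For concreteness, the kind of code in question: a plain C11 compare-and-swap loop. On targets without a native CAS instruction, such as AArch64 without the LSE extension or RISC-V with only the A extension's LR/SC, compilers generally lower this to a load-linked/store-conditional retry loop, and whether each store-conditional succeeds depends on state a record-and-replay tool like rr can't reproduce:)

    #include <stdatomic.h>
    #include <stdio.h>

    static _Atomic int counter = 0;

    /* Increment via an explicit compare-and-swap retry loop. */
    static void increment(void)
    {
        int expected = atomic_load(&counter);
        /* On failure, 'expected' is refreshed with the current value; retry. */
        while (!atomic_compare_exchange_weak(&counter, &expected, expected + 1))
            ;
    }

    int main(void)
    {
        increment();
        increment();
        printf("counter = %d\n", atomic_load(&counter)); /* prints 2 */
        return 0;
    }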
Whoa, had no idea this existed. Wild stuff. Might be "somewhat" confusing to read assembler code like that without knowing about this particular technique..
Both register windows and the delay slot exist on SPARC processors, which you’re much more likely to run into in a data center (running open-source software).
Itanium was the really odd one — it not only used register windows but could offload some of the prior windows onto the heap. Most people would probably never notice… unless you’re trying to get a conservative scanning GC working and are stumped why values in some registers seem to not be traced…
'Anyway this chip architect guy is standing up in front of this group promising the moon and stars. And I finally put my hand up and said I just could not see how you're proposing to get to those kind of performance levels. And he said well we've got a simulation, and I thought Ah, ok. That shut me up for a little bit, but then something occurred to me and I interrupted him again. I said, wait I am sorry to derail this meeting. But how would you use a simulator if you don't have a compiler? He said, well that's true we don't have a compiler yet, so I hand assembled my simulations. I asked "How did you do thousands of line of code that way?" He said “No, I did 30 lines of code”. Flabbergasted, I said, "You're predicting the entire future of this architecture on 30 lines of hand generated code?" [chuckle], I said it just like that, I did not mean to be insulting but I was just thunderstruck. Andy Grove piped up and said "we are not here right now to reconsider the future of this effort, so let’s move on".'
Sun had some funny stories around this too. When they came up with their multi-core system, they used code from 10-15 years earlier for traces, and then said 'well, nobody actually uses floating-point code' so we don't need it. Of course, over those 10 years floating point had become much more common and standard, leading to a chip that had one FPU for 8 cores, which basically meant even minimal floating point would destroy concurrency. Arguably Sun had already lost the chip war and this was just making them fall behind further. They did market it quite well, though.
And a lesser-known thing that I couldn't find much information on is that Sun also worked on a VLIW chip during the 90s. Apparently Bill Joy was convinced that VLIW was the future, so they did a VLIW chip, and the project was led by David Ditzel. As far as I am aware this was never released. If any Sun veterans have any idea about this, I would love to know.
When there is so much complaint about closed firmware in the Raspberry Pi, and the risk of the Intel ME and other closed CPU features, I wonder why these open designs are ignored. Yes, the performance and power consumption would be poor by modern standards.
These designs are not ignored. They were used for a few things here and there. But the usefulness of 'over the wall' open code without backing is always a bit limited, and for processors that cost 100k to tape out, even more so.
By now there are much better, more modern designs out there, including for RISC-V.
VLIW is maybe cool, but people will be relieving themselves on EPIC's grave for the pain that it inflicted on them.
Like if you tried to debug a software crash on Itanium. The customer-provided core dump was useless, as you could not see what was going on. Intel added a debug mode to their compilers which disabled all that EPIC stuff so you could hopefully reproduce the crash there, or on other CPU architectures. Otherwise you were basically screwed.
That HP-Intel arrangement was weird. One time, an Intel-badged employee came out to change a tape drive on a (Compaq->HP->HPE) Compaq SSL2020 tape robot. Okay, I guess they shared employees. ¯\_(ツ)_/¯
I was going to make a reference to Patterson & Hennessy, but it's too bad that the 5th and later editions are hidden behind a DRM paywall. You don't "own" books anymore.
Interesting but complementary foray into owning the end-to-end pipeline of chip design, fabrication, and packaging - especially for embedded use cases.
MIPS has also hitched its horse to RISC-V now, and I am seeing a critical mass of talent and capital forming in that space.
AFAIK MIPS still hasn't shipped a high-end processor competitive with the XuanTie 910 that article is about. And I think the billions of RISC-V microcontroller cores that have shipped already (10 billion as of 02022 according to https://wccftech.com/x86-arm-rival-risc-v-architecture-ships...) are also mostly not from MIPS.
...and if he does, why does he then consider the year 99999 to be out of reach? As I understand it the idea is to promote "long term thinking" but I really don't see how this affectation is actually supposed to achieve anything beyond mildly irritating/confusing the reader.
At least the Long Now Foundation stuff comes with that context built-in.
off-topic but: I've noticed you prefix years with a zero in your HN comments. First I thought it was just a typo, but I see you've made several comments like that. Is there some significance, or are you just raising awareness of the year 9999 problem?
It doesn't require any special commitment because it doesn't cost me anything. Certain people do post a lot of really boring comments about it, but I'm not the one posting those comments, and I don't care about those people's opinions, so I don't care.
I don't believe I'm actually doing those people any injury, so while they're obviously free to continue requesting different formatting of my posts, I'm free to ignore them.
I think it's important for people to be able to complain about things that bother them, for the reasons described in https://news.ycombinator.com/item?id=44501817. In that thread, we were discussing a different commenter requesting that an author please not use AI for editing his own books, although the request was made in a particularly obnoxious fashion. Consider "Please don't play your music so loud at night", "Please don't look at my sister", or "Please don't throw your trash out your car window". But "please format your dates differently" doesn't seem like a very important request, even if it were phrased politely, to the point that it makes me (and, as I've seen, others) think less of the people who are making it.
If my date formatting really bothers them, they're free to stop reading the site. After having looked at their comment histories, I wish some of them would, because the only thing they ever post are similarly vacuous complaints. If people had to choose between reading a site where I posted and they didn't, and a site where they posted and I didn't, 100% of people would choose the former. (Others do occasionally post something worthwhile, but nothing that inspires me to wonder how I could earn their admiration.)
I suspect it’s counterproductive, though, like deliberately not using pronouns and always referring to someone by name. The intent might be to draw attention to the author’s cause, but it’s more likely to come across that the author just writes weirdly.
Eh, I also think it's harmless, and lends a certain "brand" to their posts - which are usually quite good otherwise. Better to be weird than dull, right?
I guess, unless the offputting:goodness ratio gets lopsided and makes people start ignoring them.
Frankly, something about that leading 0 makes me grit my teeth and stop reading. I can't explain why it affects me like that. Perhaps I'm the only one who does, although threads like this seem to pop up whenever they post so I don't think so. If HN had a mute button, I'd probably use it just because it annoys me to that level.
Edit: And now that we're talking about it, they seem to have the need to mention a specific year way more than most, as though deliberately looking for opportunities to draw attention to themselves. Oof. That just made it about 10x more grating to me.
I do get where you're coming from; for me I think it interrupts the way I scan text - a date would be unconsciously absorbed but these stand out as abnormal artefacts requiring full attention.
Yeah, that's a great way of explaining it. Every time, it raises an exception in my mental parser and I have to go back and consciously evaluate it. Then I see that it's the same person who got me yet again, and I grit my teeth.
It was some time ago that MIPS did announce that they had competitive RISC-V cores and had signed customers for them: LG and in the automotive sector. I'd think those should be taped out by now, but who knows...
I think the C910 looks better on paper than it performs in practice. I hope that isn't the case for MIPS.
Adding to what was said, it also suspiciously looks like a MIPS core with a RISC-V frontend strapped to it, sort of like Qualcomm did with their Nuvia AArch64 core. In particular, stuff like the soft-fill TLB from M-mode looks just like MIPS Coprocessor 0.
There's nothing especially wrong with using an existing backend design and transitioning it to another ISA; a number of teams did that from mips->arm and had success with the result. Of course, if you ship too early you may be missing some features.
> a number of teams did that from mips->arm and had success with the result.
Do you have any examples? Apple Silicon cores took pieces of the P.A. Semi PWRficient cores, and everything else I know of either tweaked an official ARM design or started more or less from scratch.
Yeah, my understanding was that shipping a high-performance MIPS core with RISC-V instruction decoding was precisely their plan. It sounded like a pretty good plan, really. But did they manage to actually ship one? Did you get a look at a datasheet?
I can only refer to MIPS' own press releases, unfortunately. They mention 4-wide OoO, RV64GH + Zbb + Zba, no V.
That is a frustrating pattern in the RISC-V world. Many companies boast of having x-wide cores with y SPECint numbers, but nothing has been independently verified.
Yes, but their claims over the last few years have been that their RISC-V implementations will be super fast, not like all those pikers, because they're using MIPS microarchitectural techniques. And so far I haven't seen them ship anything that substantiates that.
It's an interesting comparison because MIPS used to occupy the niche that RV does now - an ISA that anyone could implement.
Lots of companies had their own mips implementation, but still might use an implementation from mips-the-company because even if you have your own team, you probably don't want to implement every core size that you might need. But then for some reason lots of them switched to using ARM, within a few years (in some cases getting an architecture licence and keeping their CPU team).
It seems like RV has a more stable structure, as the foundation doesn't licence cores, so even if one or two of the implementors die it won't necessarily reflect on the viability of the ecosystem
Suddenly another company that has (old?) fabs and a CPU design team in-house.
It could be interesting to see how much they try to loss-lead to gain market share at the low end.
GF's fabs aren't that old. They were neck-and-neck with TSMC until 02018, when they could do 12nm: https://web.archive.org/web/20190107061855/https://www.v3.co...
Imagine canning your 7nm process last minute only few years before the chip shortage.
Must be the most moronic decision ever.
and it's not like 20/20 hindsight either, because every hardware enthusiast knew at the time Intel was having troubles and was worried TSMC (and Samsung at the time) were going to be the only fabs producing leading edge lithographies.
I think it would require some work to call it a “moronic decision.” My suspicion is that even if they could see the future and predict that shortage, 7nm by 2020/2021 was not on the table for them.
These nm values are really bullshit anyway, but the tech node that was supposed to be Intel’s 7nm, which ended up being called “Intel 4” (because they branded some 10nm tech as Intel 7), only came out in like 2023. Given they Global Foundries was always behind Intel, suddenly leapfrogging them by 2-3 years would be quite a feat.
Oh no, it is a moronic decision and everyone thought so even then. It was a competitive process, they said volume production was due in late 2018 and they canned it at the very last minute citing it financially not feasible. You can read details at this news article (https://www.anandtech.com/show/13277/globalfoundries-stops-a...) or thousands of forum discussions regarding the news. No need to even look that far, just skimp the discussions on the forum topic below the news article I linked and it was plain as a day to anyone what would happen.
> These nm values are really bullshit anyway, but the tech node that was supposed to be Intel’s 7nm, which ended up being called “Intel 4” (because they branded some 10nm tech as Intel 7), only came out in like 2023. Given they Global Foundries was always behind Intel, suddenly leapfrogging them by 2-3 years would be quite a feat.
This is a very weak argument. Intel was ahead of everyone, now everyone is ahead of Intel. Remember TSMC's blunder processes like 20nm? How they turned around after that? Or how GloFo has had always mediocre processes but they finally hit the nail in the head with their 14/12nm? Fab business has always had companies leapfrogging each other, it turns out the worst sin is not trying. GloFo's greedy investors chose to bury the business in the ground for their short term profits.
I thought it was a bad decision at the time, but it does seem like a defensible one to me, for three reasons.
First, nobody knew if even TSMC was going to succeed at bringing a 7nm process to market. 02018 was maybe the height of the "Moore's Law is over" belief. There was a lot of debate about whether planar semiconductor scaling had finally reached the limit of practical feasibility, although clearly it was still two orders of magnitude from the single-atom physical limit, which had been reached by Xie's lab in 02002. Like Intel, SMIC didn't reach 7nm until 02023 (with the HiSilicon processor for Huawei's Mate60 cellphone) despite having the full backing of the world's most technically productive country, and when they did, it was a shocking surprise in international relations with the US.
Second, even if GF had brought 7nm to market, there was no guarantee it would be profitable. The most profitable companies in a market are not always the most technically advanced; often the pioneers die with arrows in their backs. If you can make 7nm chips in volume, but the price for them is so high that almost everyone sticks with 12nm processes (maybe from your competitors), you can still lose money on the R&D. Moore's Law as originally stated in "Cramming" was about how the minimum price per transistor kept moving to smaller and smaller transistors, and historically that has been an immensely strong impetus to move to smaller processes, but it's clearly weakened in recent years, with many successful semiconductor products like high-end FPGAs still shipping on very old process nodes. (Leaving aside analog, which is a huge market that doesn't benefit from smaller feature size.)
Third, we don't know what the situation inside GF was, and maybe GF's CEO did. Maybe they'd just lost all their most important talent to TSMC or Samsung, so their 7nm project was doomed. Maybe their management politics were internally dysfunctional in a way that blocked progress on 7nm, even if it hadn't been canceled. There's no guarantee that GF would have been successful at mass production of 7nm chips even in a technical sense, no matter how much money they spent on it.
In the end it seems like GF lost the bet pretty badly. But that doesn't necessarily imply that it was the wrong bet. Just, probably.
As far as I know, Global Foundries ceased efforts at 7nm and lower because they could not afford it.
They had previously signed a contract with IBM to produce silicon at these more advanced nodes that they could not honor, and there was legal action between them.
https://www.anandtech.com/show/13277/globalfoundries-stops-a...
https://newsroom.ibm.com/2025-01-02-GlobalFoundries-and-IBM-...
Yeah and IBM had to move their designs in the last minute from GF to Samsung. I have heard that the Samsung process was much better and the tech transfer was easier than expected.
First, you point out that Moore's law was about the transistor count per chip at the optimum cost process, and that's very important. We have transitioned from a more-for-less leading edge to a more-for-more leading edge. It's overall sensible for Apple to build giant chips on the newest processor not because it's cheaper but because it gives them an overall more competitive product (they only sell whole devices). Just because Apple and Nvidia keep making bigger chips doesn't mean that Moore's law is working the way it was originally proposed (Intel's marketing department notwithstanding).
In any case, at the time and still I think GF was probably correct in that they would not be able to compete at the leading edge and make money at it. Remember, AMD and IBM separated fabs out for a reason and not having the scale necessary to compete was probably a big part of that. AMD has succeeded on TSMC and IBM seems to be doing ok on Samsung. Most chips are not at the leading edge and don't need to be, and so most fabs don't need to be leading edge to serve customers. There are all kinds of applications where a more mature and better characterized process is better, whether for harsh environments, mixed signal applications, or just low volume parts where $20M of tooling cost is not worth it.
Its odd all these MBAs and few in the tech space appear to know that when a technology company stops investing in the future they are done. It might take 20+ years for that to happen but it will. Sure, stretch the timeline for the next node/product/etc but _NEVER_ stop pushing the enveloper because if you can't invest in it now, you won't be able to in a few years time when your resources are even more constrained as your customer base dwindles, or your technology becomes more commoditized or simply left behind as companies that did invest no longer have a need for their older products/lines.
> It was a competitive process
Do you have any evidence, besides GF's own PR/IR department, that the process ever actually worked in volume? Because from my point of view, how they ended things looks exactly how I would spin away a multibillion-dollar investment into a failed process.
No I don't, but then, how bad it could be? As bad as Samsung's 8nm? Or Intel's 10nm? Even they delivered something in the end. What did GF deliver? A whole fucking nothing. Samsung had Nvidia and Qualcomm as their customers even with its, ehm, not so good 8nm process. It was a sure bet GF was going to have some customers as long as they delivered something (and I don't even count AMD's wafer supply agreement).
It could be arbitrarily bad. 1% yields, 0.01% yields, 0.00001% yields. Having to write each wafer with an electron beam because they couldn't get EUV to work at 7nm.
It could be, but on the other hand it could be freaking fantastic, too. The only way we'd know if they've fucking did it, which is my point.
Since it didn't happen, the only thing we know is what they said and they said it was because of "strategic shift"
> Tom Caulfield also mentioned GF needed $3 billion dollars of additional capital to get to 12,000 wpm and they could only fund half of it through cash flow, they would have to borrow the other half and the projected return wasn’t good.
> When Tom took over as CEO he went out on the road and visited GF’s customers. What he found was a lack of commitment to GF’s 7 nm process in the customer base. Many customers were never going to go to 7 nm and of the customers who were, GF wouldn’t have enough capacity to meet their demands. There was also concern in the customer base that 7 nm would take up all the R&D and capital budgets and starve the other processes they wanted to use of investment.
(https://semiwiki.com/wikis/company-wikis/globalfoundries-wik...)
I think the proof of the pudding is in the eating: it's been seven years since the cancellation of 7LP. They have launched nothing even near the leading edge since 12LP+.
If 7LP worked, given this market and its hunger for capacity, it'd be in production at at least small scale. Equipment costs are down and knowledge has disseminated, making it a lot cheaper to launch, especially as "7nm" isn't the leading edge any more.
I don't think it works.
> I don't think it works.
Duh. Of course it doesn't work, because they cancelled it in 2018!
> making it a lot cheaper to launch, especially as "7nm" isn't the leading edge any more.
Same logic cuts both ways. If they didn't think it was financially viable in 2018 when it'd be a leading edge process and their customers would be willing to pay top dollar for it, why would they think it'd be feasible now when it isn't the leading edge lithography and nobody would be paying top dollar for it?
On top of that I doubt even your claim that it'd be cheaper to do the investment now would hold given how everything got more expensive since 2018. I'm also doubtful that machines got cheaper since ASML is still the only ones building them and they've probably got their hands full with their existing customers. They'd probably laugh at GloFo if they'd come with a request like that "Sorry GloFo, we're already booked until 2030 building machines for TSMC, Intel and Samsung maybe try at 2032" :P
GloFo got off the train and there's no going back.
>> Fab business has always had companies leapfrogging each other, it turns out the worst sin is not trying. GloFo's greedy investors chose to bury the business in the ground for their short term profits.
Name company making chips with EUV that is not TSMC, Samsung, or Intel?
3 company managed to do it, was there a law forbidding a 4th one from doing so?
GlobalFoundries didn't design their own 14/12nm process it was licensed from Samsung.
that's beside the point. The point is they executed it pretty well.
Might not be the investors, might be squarely management's fault. A lot of investors are pretty passive.
What I remember from discussions at the time was they were going to tape out very soon and start building for mass production, then UAE fund noped when things got serious.
It does mean GF is on the path to long slow decline. The decision was not "we will wait 5-10 years" but "we will not develop any new processes".
I don't fault them for failing to predict the chip shortage and huge opportunity to acquire customers that would result. The fact remains: they will eventually fade away.
Yeah, that was my reading at the time. But lots of companies have gone that direction, closing down major lines of business because they couldn't make money at them anymore. I mean, you probably remember this, but IBM used to make computers. Intel started out making RAM. HP used to make working products.
Intel changed their naming to reflect TSMC’s, as Intel 10nm had transistor densities close to TSMC’s 7nm.
That was a huge gift to AMD since it let them use TSMC as for fabrication instead, and they gained a process node advantage over Intel for the first time in history.
My guess is that the guys in Abu Dhabi did not want to do the investments needed to bring 7nm into production. They lost a huge opportunity because of that. At the time, it probably looked like the right financial decision to them, even though practically everyone affected downstream thought it was myopic.
Intel struggled for years with their 7nm process to the point where they are now fabbing their latest ICs at TSCM.
Pursuing 7nm would have likely bankrupted GloFo.
IIRC, that's because Intel attempted it without EUV to save a handful of dollars.
>Imagine canning your 7nm process last minute only few years before the chip shortage.
https://www.eetimes.com/samsung-globalfoundries-prep-14nm-pr...
"Samsung expects to be in production late this year with a 14 nm FinFET process it has developed. GlobalFoundries has licensed the process and will have it in production early next year."
GlobalFoundries licensed 14nm from Samsung. How do you know GlobalFoundries is capable of 7nm?
I know that, but I've brought it up anyway. It's irrelevant who they've licenced it from because they executed it god damn well.
This was from 02014, btw.
btw, what's with the leading zero here?
It's a meme that's supposed to get people to think in >4-digit timescales, apparently. Always makes me think of octal TBH
I think they've over-corrected from the two digits are enough truncation that was common in computers between the 1950s and ~2000. It started to become less common then, but the phase out is arguably still going or stalled until things just die.
However OCTAL (leading zero) prefixing of a text mode number fails on a number of points:
* It's still a fixed register size (5 characters), which will overflow on the year 100000 AD.
* It's confusing, everyone else.
* It's not technically correct. (human behavior)
Truncating to two year digits was confusing because ambiguity. There is no ambiguity if a number encoded in decimal uses precisely the number of characters it needs. That's how normal humans normally write numbers.
He's talking about AD 1036. Try to keep up
It's there to provoke your question
Indeed. Consider it trolling, ignore it. It's just stupid.
Best to just downvote it then.
It's "Long Now" stuff, which really should be called "Medium Now" because they're only using one leading zero.
kragen thinks making most of his readers glitch for a second every time they read one of his dates is worth it on order to advertise for the Long Now. Really unfortunate choice, since he often has decent information to share.
that's my point. how does OP know GlobalFoundries is capable of 7nm if they can't even do 14nm. do you have any insider info that you can share?
They signed a contract with IBM to manufacture their chips at 7nm (and reneged).
I agree, and I wrote a longer comment agreeing with your point at https://news.ycombinator.com/item?id=44503245.
No I don't have insider info. Neither do you. What an ridiculous nit to pick.
You're the one claiming it's the "most moronic decision ever".
The burden of proof is on you to support your claim that they could have executed a 7nm process profitably, as opposed to them looking at the data and coming to a rational conclusion that they couldn't.
Right, it seems like kasabali is making claims that are considerably more absolute than they would be able to justify without insider info.
> The burden of proof is on you to support your claim that they could have executed a 7nm process profitably
Why the fuck I'd have to prove that given that GloFo themselves claimed that they pulled out of it because it'd be unprofitable? Some people in this subthread are very eager to put words into my mouth.
> Imagine canning your 7nm process last minute only few years before the chip shortage.
What?
The chip shortage was a shortage of cheap but inferior 28nm, 40nm, 65nm and 80nm chips that GlobalFoundries was (and still is) well positioned to profit from.
This is a good point, but they weren't necessarily very cheap or inferior; they were just fabricated in larger process nodes, which for analog chips (as many of them were) doesn't imply inferiority.
In terms of years, not that old.
In terms of process, it's ancient.
I suspect the vast majority of chips made today are made at the 12nm process node or coarser, because I know of some surprising examples. Smaller process nodes are critical to the highest-performance digital chips, but a lot of chips aren't those.
Eh if humanity had to revert to 12nm nothing would noticeably get worse
Cellphones would, servers would, not much else.
I, personally, am not doing much with my phone i wasn't doing 10 years ago, which about lines up with a 16nm process. I think the camera is the only thing I'd really notice.
They decided to pivot to innovation that does not require extreme CMOS scaling. For example, they focussed heavily on ultra-low-power SOI at 28nm.
Keep in mind that your iphone only has very few chips in <10nm technology. The rest is using much larger groundrules, even the memory.
But that stuff tends to be much lower margin, and while this year you might have the best power/price numbers, next year someone figures out their product is even lower power on some newer fab that is slowly lowering its price and now the competition forces the margin even lower. Repeat until you have some 40 year old fabs and no customers.
Consider also that 28nm planar transistors are more durable than FINFET, especially in the dissipation of heat.
The automobile industry showed us that there is demand for older nodes.
90 % of chip demand is in older nodes anyway.
>> Suddenly another company that has (old?) fabs and a cpu design team in-house
Glo-flo is leading edge for anyone without EUV.
SMIC is someone without EUV who is shipping 7nm for two years now.
Not having euv means you have old fabs.
MIPS seems like a story of missed (mipsed?) opportunities. If MIPS had really been (or remained) an open architecture, there would have been little need for RISC-V. They had a decade+ head start in terms of tool support and silicon implementation, compressed/16-bit instruction formats, full 64-bit instruction sets, and scaling from embedded systems to HPC.
That whacky multiplier setup though. Who could've ever foreseen that bottleneck.
How are the various riscv cpu IP vendors generally doing financially?
Is this the very beginning of a market consolidation?
Esperanto just shut down. https://www.eetimes.com/ai-startup-esperanto-winds-down-sili...
There are a lot of different CPU IP vendors working on RISC-V. China's a big source of it, and I shouldn't have to explain why.
I don't think people generally pay for RISC-V CPU IP.
For ISA? Certainly not. For actual designs, for sure. Why wouldn't they unless there's some open source designs they'd be using?
Well, because there are open-source designs they'd be using. The GD32V microcontroller, for example, uses Nucleisys's BumbleBee, and high-performance chips from several vendors use Brother Honey Badger's Apache-licensed XuanTie C910: https://github.com/XUANTIE-RV/openc910
But see https://news.ycombinator.com/item?id=44503847
There are open source designs. Here is one:
https://github.com/riscv-boom/riscv-boom
Companies that are putting down millions for fab runs absolutely pay shitloads of money for it. The cost of design and verification of those components is enormous and that's mostly what you pay for. People have been shipping Andes and SiFive IP for years now. Downloading source dumps for C910 cores is not the hard part.
For most places that kind of high-cost work doesn't make much sense when their product isn't "a CPU", and they also typically have to buy other IP anyway like memory controllers or I/O blocks -- so buying a CPU core isn't that strange in the grand scheme.
Thank you very much!
They do if they aren't implementing the ISA in silicon themselves. Its interesting to see who's designs are selling, who's aren't and why.
Sure they do, most IP is proprietary
AMD's former girlfriend now married MIPS. Yikes
Put that in your delay slot and smoke it.
https://en.wikipedia.org/wiki/Delay_slot
I'm surprised by how many other architectures use it.
It seemed like a good idea in 01981; the purported expansion of MIPS was "Microprocessor without Interlocked Pipeline Stages", although of course it's a pun on "millions of instructions per second". By just omitting the interlock logic necessary to detect branch hazards and putting the responsibility on the compiler, you get a chip that can run faster with less transistors. IBM's 45000-transistor 32-bit RISC "ROMP" was fabbed for use in IBM products that year, which gives you an idea of how precious silicon area was at the time.
Stanford MIPS was extremely influential, which was undoubtedly a major factor in many RISC architectures copying the delay-slot feature, including SPARC, the PA-RISC, and the i860. But the delay slot really only simplifies a particular narrow range of microarchitectures, those with almost exactly the same pipeline structure as the original. If you want to lengthen the pipeline, either you have to add the interlocks back in, or you have to add extra delay slots, breaking binary compatibility. So delay slots fell out of favor fairly quickly in the 80s. Maybe they were never a good tradeoff.
One of the main things pushing people to RISC in the 80s was virtual memory, specifically, the necessity of being able to restart a faulted instruction after a page fault. (See Mashey's masterful explanation of why this doomed the VAX in https://yarchive.net/comp/vax.html.) RISC architectures generally didn't have multiple memory accesses or multiple writes per instruction (ARM being a notable exception), so all the information you needed to restart the failed instruction successfully was in the saved program counter.
But delay slots pose a problem here! Suppose the faulting instruction is the delay-slot instruction following a branch. The next instruction to execute after resuming that one could either be the instruction that was branched to, or the instruction at the address after the delay-slot instruction, depending on whether the branch was taken or not. That means you need to either take the fault before the branch, or the fault handler needs to save at least the branch-taken bit. I've never programmed a page-fault handler for MIPS, the SPARC, PA-RISC, or the i860, so I don't know how they handle this, but it seems like it implies extra implementation complexity of precisely the kind Hennessy was trying to weasel out of.
The WP page also mentions that MIPS had load delay slots, where the datum you loaded wasn't available in the very next instruction. I'm reminded that the Tera MTA actually had a variable number of load delay slots, specified in a field in the load instruction, so the compiler could give the memory reference as many instructions as possible to come back from RAM over the packet-switching network. (The CPU would then stall your thread if the load took longer than the allotted number of instructions, but the idea was that a compiler that prefetched enough stuff into your thread's huge register set could make such stalls very rare.)
I think the program counter is backed up and the branch is just re-executed. Though it's annoying if the handler wants to skip over the faulting instruction (e.g. it was a syscall), as it now needs to emulate the branch behavior in software. Most of the complexity is punted to software; I think the only hardware tweaks needed are keeping an in-delay-slot flag in the fault description, and keeping the address of the currently executing instruction for fault reporting and PC-relative addressing (which could probably be omitted otherwise, since keeping only the next instruction's address would be enough).
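A rough C sketch of the software side of that scheme, assuming MIPS-style trap state (EPC points at the branch rather than the delay slot, and a BD flag records the situation); all names here are made up, and only beq/bne are decoded, just to show the shape of the problem:

    #include <stdint.h>
    #include <stdbool.h>

    /* Hypothetical trap state; on real MIPS this corresponds to EPC,
     * the Cause.BD bit, and the saved general registers. */
    struct trapframe {
        uint32_t epc;      /* branch address if bd is set, else the faulting insn */
        bool     bd;       /* fault happened in a branch delay slot */
        uint32_t regs[32]; /* saved general registers */
    };

    /* Decode just enough of the branch at epc to know where execution
     * goes next.  Only beq/bne are handled here, as an illustration. */
    static bool branch_taken(const struct trapframe *tf, uint32_t insn) {
        uint32_t op = insn >> 26, rs = (insn >> 21) & 31, rt = (insn >> 16) & 31;
        if (op == 0x04) return tf->regs[rs] == tf->regs[rt]; /* beq */
        if (op == 0x05) return tf->regs[rs] != tf->regs[rt]; /* bne */
        return false; /* other branch/jump forms: not handled in this sketch */
    }

    static uint32_t branch_target(const struct trapframe *tf, uint32_t insn) {
        int32_t off = (int16_t)(insn & 0xffff);  /* sign-extended word offset */
        return tf->epc + 4 + ((uint32_t)off << 2);
    }

    /* If the handler merely fixed the fault (paged the data in), it can
     * return to epc and let the branch harmlessly re-execute.  If it has
     * to skip the faulting delay-slot instruction (it emulated it, or it
     * was a syscall), it must emulate the branch itself.  'insn' is the
     * branch word, which a real handler would fetch from the faulting
     * address space. */
    uint32_t resume_pc_after_skip(const struct trapframe *tf, uint32_t insn) {
        if (!tf->bd)
            return tf->epc + 4;                      /* ordinary instruction */
        return branch_taken(tf, insn) ? branch_target(tf, insn)
                                      : tf->epc + 8; /* fall through past the slot */
    }

The "skip" path is where the software cost shows up: the handler ends up carrying a little branch emulator inside it.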
Thank you! I guess that, as long as the branch instruction itself can't modify any of the state that would cause it to branch or not, that's a perfectly valid solution. It seems like load delay slots would be more troublesome; I wonder how the MIPS R2000 and R3000 handled that? (I'm not sure the Tera supported virtual memory.)
Load delay slots don't seem to need special fault-handling support; you're not supposed to depend on the old value still being there in the delay slot.
One more thing about branch delay slots: it seems the original SuperH went for a very minimal solution. It prevents interrupts from being taken between a branch and its delay slot, and not much else. PC-relative accesses are relative to the branch target, and faults are also reported with the branch target address. As far as I can see, this makes faults in branch delay slots unrecoverable. In SH-3 they patched that by reporting faults in the delay slots of taken branches with the branch address itself, so things can be fixed up in the fault handler.
Hmm, I guess that if the load instruction doesn't change anything except the destination register (unlike, for example, postincrement addressing modes) and the delay-slot instruction also can't do anything that would change the effective address being loaded from before it faulted (and can't depend on the old value), then you're right that it wouldn't need any special fault handling support. I'd never tried to think this through before, but it makes sense. I appreciate it.
As for SH2, ouch! So SH2 got pretty badly screwed by delay slots, eh?
Even without faults, some Stack Overflow answers indicate that on the R2000 the new value might be available in the delay slot if the load was a cache miss.
As for SuperH, I don't think they cared too much. The primary use of fault handling is memory paging, and the MMU was only added in SH-3, which is probably why they also fixed delay-slot fault recovery then. Before that, faults were either illegal opcodes or alignment violations, and the answer for those was probably "don't do that".
The new value was available earlier if it was a cache miss?
I didn't remember that the SH2 didn't support virtual memory (perhaps because I've never used SuperH). That makes sense, then.
I think that, for the ways people most commonly use CPUs, it's acceptable if the value you read from a register in a load delay slot is nondeterministic, for example depending on whether you resumed from a page fault or not, or whether you had a cache miss or not. It could really impede debugging if it happened in practice, and it could impede reverse-engineering of malware, but I believe that such things are actually relatively common. (IIRC you could detect the difference between an 8086 and an 8088 by modifying the next instruction in the program, which would have been already loaded by the 8086 but not the 8088. But I'm guessing that under a single-stepping debugger the 8086 would act like an 8088 in this case.) The solution would probably be "Stop lifting your arm like that if it hurts;" it's easy enough to not emit the offending instruction sequences from your compiler in this case.
The case where people really worry about nondeterminism is where it exposes information in a security-violating way, as in Spectre, which isn't even nondeterminism at the register-contents level, just the timing level.
Myself, I have a strong preference for strongly deterministic CPU semantics, and I've been working on a portable strongly deterministic (but not for timing) virtual machine for archival purposes. But clearly strong determinism isn't required for a usable CPU.
>The new value was available earlier if it was a cache miss?
Apparently so. Maybe the logic is that the value is available one instruction later if it's a hit, but a miss stalls the entire pipeline anyway, and it resumes only when the result is available.
One source of non-determinism that stayed around for a long time in various architectures was LL/SC (load-linked/store-conditional) atomics. It mostly didn't matter, but e.g. the rr recording debugger on AArch64 doesn't work on applications that use these instead of the newer CAS-extension atomics.
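A rough C11 sketch of the distinction (the function here is made up; the point is that the weak/strong compare-exchange split maps fairly directly onto LL/SC vs. a single-instruction CAS):

    #include <stdatomic.h>
    #include <stdbool.h>

    /* compare_exchange_weak is allowed to fail spuriously, which is
     * exactly the LL/SC case: the store-conditional can fail for reasons
     * the program can't observe (an interrupt, the reservation being
     * lost), so the number of trips around this loop can differ from
     * run to run even with identical inputs. */
    bool increment_if_even(_Atomic int *p) {
        int old = atomic_load(p);
        for (;;) {
            if (old % 2 != 0)
                return false;
            if (atomic_compare_exchange_weak(p, &old, old + 1))
                return true;
            /* failure (possibly spurious): 'old' now holds the current
             * value and we go around again */
        }
    }

    /* With a real CAS instruction (e.g. ARMv8.1 LSE; RISC-V has the
     * Zacas extension now, I believe), compare_exchange_strong succeeds
     * or fails purely as a function of the values involved, which a
     * record/replay tool like rr can reproduce on replay, whereas
     * whether an SC succeeds is not determined by architectural state. */
    bool increment_if_even_cas(_Atomic int *p) {
        int old = atomic_load(p);
        while (old % 2 == 0) {
            if (atomic_compare_exchange_strong(p, &old, old + 1))
                return true;
        }
        return false;
    }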
Oh, that makes sense.
WRT LL/SC, I don't think it's dead yet—isn't RISC-V's A extension using a sort of LL/SC? rr is indeed exactly the kind of collateral damage that I deplore. rr is amazing.
Or "make interlocks programmed in software". But later MIPS versions had hardware interlocks I believe.
Some kernel somewhere:
Switch(mipsarch): Case 1: Nop.
Case 2: Noop.
Case 10: Noooooooooop.
Whoa, had no idea this existed. Wild stuff. Might be "somewhat" confusing to read assembler code like that without knowing about this particular technique...
Many assemblers had an option to reorder on assembly so you could write it normally, while only taking care to avoid hazards near branches.
At least one toolchain would just pad the slots with nops
Allow me to introduce you to register windows.
https://www.jwhitham.org/2016/02/risc-instruction-sets-i-hav...
Both register windows and the delay slot exist on SPARC processors, which you’re much more likely to run into in a data center (running open-source software).
Itanium was the really odd one — it not only used register windows but could offload some of the prior windows onto the heap. Most people would probably never notice… unless you’re trying to get a conservative scanning GC working and are stumped why values in some registers seem to not be traced…
Pour one out for Itanium. It tried to make the panacea of VLIW and branch hints work, but it didn't pan out.
From an interview with Bob Colwell:
'Anyway this chip architect guy is standing up in front of this group promising the moon and stars. And I finally put my hand up and said I just could not see how you're proposing to get to those kind of performance levels. And he said well we've got a simulation, and I thought Ah, ok. That shut me up for a little bit, but then something occurred to me and I interrupted him again. I said, wait I am sorry to derail this meeting. But how would you use a simulator if you don't have a compiler? He said, well that's true we don't have a compiler yet, so I hand assembled my simulations. I asked "How did you do thousands of line of code that way?" He said “No, I did 30 lines of code”. Flabbergasted, I said, "You're predicting the entire future of this architecture on 30 lines of hand generated code?" [chuckle], I said it just like that, I did not mean to be insulting but I was just thunderstruck. Andy Grove piped up and said "we are not here right now to reconsider the future of this effort, so let’s move on".'
https://www.sigmicro.org/media/oralhistories/colwell.pdf
Sun had some funny stories around this too. When they came up with their multi-core system, they used code from 10-15 years earlier for traces, and then said 'well, nobody actually uses floating-point code, so we don't need it'. Of course, over those 10 years floating point had become much more common and standard, leading to a chip that had one FPU for 8 cores, which basically meant even minimal floating point would destroy concurrency. Arguably Sun had already lost the chip war and this just made them fall behind further. They did market it quite well, though.
And a lesser-known thing that I couldn't find much information on is that Sun also worked on a VLIW chip during the 90s. Apparently Bill Joy was convinced that VLIW was the future, so they did a VLIW chip, and the project was led by David Ditzel. As far as I am aware this was never released. If any Sun veterans have any idea about this, I would love to know.
As for the single FPU that you mention, the T1 is an open-source CPU.
https://www.oracle.com/servers/technologies/opensparc-t1-pag...
The T2 is also open, and places an FPU in each core.
https://www.oracle.com/servers/technologies/opensparc-t2-pag...
When there is so much complaint about the closed firmware in the Raspberry Pi, and about the risk of the Intel ME and other closed CPU features, I wonder why these open designs are ignored. Yes, the performance and power consumption would be poor by modern standards.
These designs are not ignored; they were used for a few things here and there. But the usefulness of 'over the wall' open code without backing is always a bit limited, and for processors that cost 100k to tape out, even more so.
By now there are much better, more modern designs out there, and for RISC-V.
VLIW is maybe cool, but people will be relieving themselves on EPIC's grave for the pain that it inflicted on them.
Like if you tried to debug a software crash on Itanium: the customer-provided core dump was useless, as you could not see what was going on. Intel added a debug mode to their compilers which disabled all that EPIC machinery so that hopefully you could reproduce the crash there, or on other CPU architectures. Otherwise you were basically screwed.
EPIC :nauseous face emoji:
That HP-Intel arrangement was weird. One time, an Intel-badged employee came out to change a tape drive on a (Compaq->HP->HPE) Compaq SSL2020 tape robot. Okay, I guess they shared employees. ¯\_(ツ)_/¯
I was going to make a reference to Patterson & Hennessy, but it's too bad that the 5th and later editions are hidden behind a DRM paywall. You don't "own" books anymore.
The TI C40 used them.
SPIM says "all shall be efficient single cycle instructions and to heck with the MHz wars!" /s
Interesting but complementary foray into owning the end-to-end pipeline of chip design, fabrication, and packaging - especially for embedded use cases.
MIPS has also hitched its horse to RISC-V now, and I am seeing a critical mass of talent and capital forming in that space.
The critical mass of talent and capital forming in the RISC-V space happened in 02019 at Alibaba: https://www.cnx-software.com/2019/07/27/alibaba-unveils-xuan...
AFAIK MIPS still hasn't shipped a high-end processor competitive with the XuanTie 910 that article is about. And I think the billions of RISC-V microcontroller cores that have shipped already (10 billion as of 02022 according to https://wccftech.com/x86-arm-rival-risc-v-architecture-ships...) are also mostly not from MIPS.
(BTW, why do you write years with a leading zero? Do you expect these posts to still matter past the year 9999?)
...and if he does, why does he then consider the year 99999 to be out of reach? As I understand it the idea is to promote "long term thinking" but I really don't see how this affectation is actually supposed to achieve anything beyond mildly irritating/confusing the reader.
At least the Long Now Foundation stuff comes with that context built-in.
https://longnow.org/
Good point, I will start the Longer Now foundation and start adding two zeroes to the front of all my years.
This whole line of conversation and use of the leading zero reminds me of The Church of MOO from the old internet.
Huh, I know about Bob Dobbs et al., but MOO-ism somehow escaped my notice back in the day.
http://textfiles.com/occult/MOOISM/
I have some retro-reading to do.
off-topic but: I've noticed you prefix years with a zero in your HN comments. First I thought it was just a typo, but I see you've made several comments like that. Is there some significance, or are you just raising awareness of the year 9999 problem?
It’s the Long Now Foundation’s convention - a bit cultish but harmless.
https://longnow.org/ideas/long-now-years-five-digit-dates-an...
I think that's some "Long Now Foundation" meme.
That. Personally I think it's performative nonsense, but you have to admire the commitment to it.
It doesn't require any special commitment because it doesn't cost me anything. Certain people do post a lot of really boring comments about it, but I'm not the one posting those comments, and I don't care about those people's opinions, so I don't care.
I don't believe I'm actually doing those people any injury, so while they're obviously free to continue requesting different formatting of my posts, I'm free to ignore them.
I think it's important for people to be able to complain about things that bother them, for the reasons described in https://news.ycombinator.com/item?id=44501817. In that thread, we were discussing a different commenter requesting that an author please not use AI for editing his own books, although the request was made in a particularly obnoxious fashion. Consider "Please don't play your music so loud at night", "Please don't look at my sister", or "Please don't throw your trash out your car window". But "please format your dates differently" doesn't seem like a very important request, even if it were phrased politely, to the point that it makes me (and, as I've seen, others) think less of the people who are making it.
If my date formatting really bothers them, they're free to stop reading the site. After having looked at their comment histories, I wish some of them would, because the only thing they ever post are similarly vacuous complaints. If people had to choose between reading a site where I posted and they didn't, and a site where they posted and I didn't, 100% of people would choose the former. (Others do occasionally post something worthwhile, but nothing that inspires me to wonder how I could earn their admiration.)
So, these days, I can easily ignore it.
Especially the day after this comment of mine got voted up to +151: https://news.ycombinator.com/item?id=44491713
Every time I encounter it, I think to myself: ah, that guy again. Always brings a little smile to my face. Keep it up!
Thanks!
> I don't care about those people's opinions, so I don't care.
This was stronger before you edited in the longer paragraphs telling us that you look through their comment histories.
Look, you do you, but I'd rather hear your passionate promotion of the leading zero approach since you apparently know it irritates people.
Edit: PS Isn't it interesting that the comment that you are justly proud of doesn't have any dates in it? :)
I suspect it’s counterproductive, though, like deliberately not using pronouns and always referring to someone by name. The intent might be to draw attention to the author’s cause, but it’s more likely to come across that the author just writes weirdly.
Eh, I also think it's harmless, and lends a certain "brand" to their posts - which are usually quite good otherwise. Better to be weird than dull, right?
idk, I personally was confused by it; I thought they were referencing some node number or some other thing and got hung up on it...
I guess, unless the offputting:goodness ratio gets lopsided and makes people start ignoring them.
Frankly, something about that leading 0 makes me grit my teeth and stop reading. I can't explain why it affects me like that. Perhaps I'm the only one who does, although threads like this seem to pop up whenever they post so I don't think so. If HN had a mute button, I'd probably use it just because it annoys me to that level.
Edit: And now that we're talking about it, they seem to have the need to mention a specific year way more than most, as though deliberately looking for opportunities to draw attention to themselves. Oof. That just made it about 10x more grating to me.
I do get where you're coming from; for me I think it interrupts the way I scan text - a date would be unconsciously absorbed but these stand out as abnormal artefacts requiring full attention.
Yeah, that's a great way of explaining it. Every time, it raises an exception in my mental parser and I have to go back and consciously evaluate it. Then I see that it's the same person who got me yet again, and I grit my teeth.
You aren’t the only one, it interrupts the flow when I’m reading.
It was some time ago that MIPS announced that they had competitive RISC-V cores and had signed customers for them: LG, and others in the automotive sector. I'd think those should be taped out by now, but who knows...
I think the C910 looks better on paper than it performs in practice. I hope that isn't the case for MIPS.
Do you have any details?
Adding to what was said, it also suspiciously looks like a MIPS core with a RISC-V frontend strapped to it, sort of like what Qualcomm did with their Nuvia AArch64 core. In particular, stuff like the soft-fill TLB from M-mode looks just like MIPS Coprocessor 0.
There's nothing especially wrong with using an existing backend design and transitioning it to another ISA; a number of teams did that from mips->arm and had success with the result. Of course, if you ship too early you may be missing some features.
> a number of teams did that from mips->arm and had success with the result.
Do you have any examples? Apple Silicon cores took pieces of the PWRficient cores, and everything else I know of either tweaked an official ARM design or started more or less from scratch.
Yeah, my understanding was that shipping a high-performance MIPS core with RISC-V instruction decoding was precisely their plan. It sounded like a pretty good plan, really. But did they manage to actually ship one? Did you get a look at a datasheet?
No datasheet, but their commits to OpenSBI have been extremely enlightening w.r.t. S- and M-modes.
Aha! I never would have thought to look there!
I can only refer to MIPS' own press releases, unfortunately. They mention 4-wide OoO, RV64GH + Zbb + Zba, but no V.
That is a frustrating pattern in the RISC-V world: many companies boast x-wide cores with y SPECint numbers, but nothing has been independently verified.
No V sounds like a bad sign for performance. Do they have any part numbers?
> AFAIK MIPS still hasn't shipped a high-end processor competitive with the XuanTie 910 that article is about
The last high end MIPS was in the SGI times, 30 years ago.
Yes, but their claims over the last few years have been that their RISC-V implementations will be super fast, not like all those pikers, because they're using MIPS microarchitectural techniques. And so far I haven't seen them ship anything that substantiates that.
Loongson was making them until recently.
https://www.theregister.com/2025/05/06/loongson_inspur_cloud...
Yes, but MIPS wasn't.
It's an interesting comparison because MIPS used to occupy the niche that RV does now - an ISA that anyone could implement.
Lots of companies had their own MIPS implementation, but still might use an implementation from MIPS-the-company, because even if you have your own team, you probably don't want to implement every core size that you might need. But then for some reason lots of them switched to using ARM within a few years (in some cases getting an architecture licence and keeping their CPU team).
It seems like RV has a more stable structure, as the foundation doesn't licence cores, so even if one or two of the implementors die it won't necessarily reflect on the viability of the ecosystem
> an ISA that anyone could implement.
You want to burn your initial capital on lawyers? This is MIPS we're talking about.