scuol 9 hours ago

It still seems to have the problem most other LLMs (except Gemini) suffer from: it loses context so quickly.

I asked it about a paper I was looking at (SLOG [0]) and it basically lost the context of what "slog" referred to after 3 prompts.

1. I asked for an example transaction illustrating the key advantages of the SLOG approach. It responded with some general DB transaction stuff.

2. I then said "no, use slog like we were talking about", and it gave me a Go example using the log/slog package (something like the snippet below).
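
For the record, here's the kind of thing it gave me -- a reconstruction on my part, not its exact output -- Go's structured-logging package rather than anything from the SLOG paper:

    package main

    import (
        "log/slog"
        "os"
    )

    func main() {
        // Structured logging with Go's log/slog package --
        // the wrong "slog" entirely.
        logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))
        logger.Info("transaction committed", "txn_id", 42, "region", "us-east")
    }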

Even without the weird political things around Grok, it just isn't that good.

[0] https://www.vldb.org/pvldb/vol12/p1747-ren.pdf

jampa 9 hours ago

Honestly, Grok's technology is not impressive at all, and I wonder why anyone would use it:

- Gemini is state-of-the-art for most tasks

- ChatGPT has the best image generation

- Claude is leading in coding solutions

- DeepSeek is getting old, but it is open source

- Qwen has impressive lightweight models.

But Grok (and Llama) is even worse than DeepSeek for most of the use cases I tried. The only thing they have going for them is the money behind their infamous founders; otherwise, their existence would barely be acknowledged.

  • dilap 9 hours ago

    I like it! For me it has replaced Sonnet (3.5 at the time, but 3.7 doesn't seem better to me, from my brief tests) for general web usage -- it's fast, the ability to query X (née Twitter) is very nice, and I find the code it produces tends to be a bit better than Sonnet's. (Though perhaps that depends a lot on the domain... I'm doing mostly C# in Unity.)

    For tough queries o3 is unmatched in my experience.

  • jbellis an hour ago

    Grok 3 mini is the best model in its price range for code that doesn't train on your data, so it's part of Brokk's free plan. https://brokk.ai

    • bigyabai 40 minutes ago

      > that doesn't train on your data.

      Don't say that for sure unless you're running inference on your own machine.

  • t1amat 6 hours ago

    Llama is arguably the reason open-weight LLMs are a thing, with the leak of Llama 1 and the subsequent release of Llama 2. Llama 3 was a huge push for quality, size, context length, and multi-modality. Llama 4 Maverick is clearly better than it looks if a fine-tune can put it at the top of the LMArena human-preference leaderboard.

    Grok 3 mini is quite a decent agentic model and competitive with frontier models at a fraction of the cost; see livebench.ai.

  • Zambyte 6 hours ago

    The only interesting thing about Grok is using it hooked up to the X firehose to query about events in real time. Unfortunately it sucks at that.

  • adrr 23 minutes ago

    At least twice, they had unauthorized changes to their system prompts that injected far-right content into random conversations. Imagine you're using it for a chatbot and it starts spouting white-nationalist content like "great replacement" theory.

    https://www.theguardian.com/technology/2025/may/14/elon-musk...

  • ls612 6 hours ago

    Before the release of Gemini 2.5, Grok 3 was the best coding AI IME, especially when you used reasoning. It also complained the least about things you asked it to do. Gemini, for instance, still won’t tell you how to use yt-dlp.

  • bn-l 6 hours ago

    I’ve found 3.7 to be garbage. I rarely use it except for brainless workhorse agent tasks, where I should probably be using a free model anyway. It really mangles code if you let it do anything slightly complicated.

  • Workaccount2 6 hours ago

    I just can't help but feel that Grok is a passionless project, thrown together when the world's richest man/"Hello fellow nerds" guy played with ChatGPT, said "this is cool, make me a copy", and then FOMO'd $50B into building models.

    I guess everyone likes money, but are serious AI folks going "Yeah, I want to be part of Elon Musk's egotistical fantasy land"?

    • hnsigmaomega 4 hours ago

      Do you know who started OpenAI?

      • Workaccount2 3 hours ago

        OpenAI in 2018 was not sitting on the same tech as it was in 2023. It just makes the FOMO even more apparent.

  • daveguy 33 minutes ago

    - Grok is leading for those who want to be lied to in a racist and/or sexist bullshit kinda way.

dbreunig 9 hours ago

Can anyone provide a reason an enterprise would choose Grok over a similar class of models?

  • pantsforbirds 5 hours ago

    When Grok 3 was released, it was genuinely one of the very best for coding. Now that we have Gemini 2.5 Pro, o4-mini, and Claude 3.7 thinking, it's no longer the best for most coding. I find it still does very well with more classic data-science problems (numpy, pandas, etc.).

    Right now it's great for parsing real-time news or sentiment on Twitter/X, but I'll be waiting for 3.5 before I set up the API.

  • vasusen 6 hours ago

    We considered it for generating ruthless critiques of UI/UX (a "product roast" feature). Other classes of models were really hesitant/bad at actually calling out issues and generally seemed to err toward pleasing the user.

    Here's a simple example I tried just now. Grok correctly removed the mushrooms, but ChatGPT kept trying to add everything (I assume to be more compliant with the user):

    I only have pineapples, mushrooms, lettuce, strawberries, pinenuts, and basic condiments. What salad can I make that's yummy?

    Grok: Pineapple-Strawberry Salad with Lettuce and Pine Nuts - https://x.com/i/grok/share/exvHu2ewjrWuRNjSJHkq7eLSY

    ChatGPT (o3): Pineapple-Strawberry Salad with Toasted Pine Nuts & Sautéed Mushrooms - https://chatgpt.com/share/682b9987-9394-8011-9e55-15626db78b...

    • CamperBob2 an hour ago

      What kind of test is that? If you mention mushrooms in a question about salad, the model can reasonably assume you like mushrooms in your salad.

      • TimorousBestie 15 minutes ago

        Mushrooms do not go with strawberries or pineapples in the context of a salad.

        The only dishes where I can imagine pineapple and mushroom together are a pizza, or something grilled as part of a teriyaki meal.

    • tmpz22 2 hours ago

      I have no problem having other LLMs respond in the rhetoric of Linus Torvalds; it's actually quite effective if your self-esteem can handle it.

    • BoorishBears 6 hours ago

      I haven't seen a model since the 3.5 Turbo days that can't be ruthless if asked to be. And Grok is about as helpful as any other model despite Elon's claims.

      Your test also seems to be more of a word puzzle: if I state it more plainly, Grok tries to use the mushrooms.

      https://grok.com/share/bGVnYWN5_2db81cd5-7092-4287-8530-4b9e...

      And in fact, via the API with no system prompt it also uses mushrooms.
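
      For concreteness, "via the API with no system prompt" means a bare request like this minimal sketch, assuming xAI's OpenAI-compatible chat completions endpoint (the endpoint URL and model name here are my assumptions):

        package main

        import (
            "bytes"
            "encoding/json"
            "fmt"
            "io"
            "net/http"
            "os"
        )

        func main() {
            // No system message at all: the request carries only the user turn.
            payload, _ := json.Marshal(map[string]any{
                "model": "grok-3", // assumed model name
                "messages": []map[string]string{
                    {"role": "user", "content": "I only have pineapples, mushrooms, lettuce, strawberries, pinenuts, and basic condiments. What salad can I make that's yummy?"},
                },
            })

            req, _ := http.NewRequest("POST", "https://api.x.ai/v1/chat/completions", bytes.NewReader(payload))
            req.Header.Set("Content-Type", "application/json")
            req.Header.Set("Authorization", "Bearer "+os.Getenv("XAI_API_KEY"))

            resp, err := http.DefaultClient.Do(req)
            if err != nil {
                panic(err)
            }
            defer resp.Body.Close()

            // Dump the raw JSON; check whether the recipe keeps the mushrooms.
            body, _ := io.ReadAll(resp.Body)
            fmt.Println(string(body))
        }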

      So like most models it just comes down to prompting.

  • belter 6 hours ago

    You like your Clippy with Roman salutes?

  • thinkingtoilet 6 hours ago

    If it were important to you to be suspicious about the Holocaust, you could use Grok over other LLMs.

cosmicgadget 10 hours ago

Finally, I can use Microsoft's cloud to generate Zerohedge comments.

> They also come with additional data integration, customization, and governance capabilities not necessarily offered by xAI through its API.

Maybe we'll see a "Grok you can take to parties" come out of this.

  • bn-l 6 hours ago

    Also, any other LLM is good for Reddit comments, ironically.

wormlord 9 hours ago

The desire to be "centrist" on HN is perplexing to me.

The fact that Elon, a white South African, made his AI go crazy by adding some text about "white genocide" is factual and should be taken into consideration if you want to have an honest discussion about ethics in tech. Pretending that you can't evaluate the technology politically because doing so would be "biased" is just a separate bias, one in defence of whoever controls the technology.

  • reverendsteveii 9 hours ago

    "Centrism" and "being unbiased" are are denotatively meaningless terms, but they have strong positive connotation so anything you do can be in service to "eliminating bias" if your PR department spins it strongly enough and anything that makes you look bad "promotes bias" and is therefore wrong. One of the things this administration/movement is extraordinarily adept at is giving people who already feel like they want to believe every tool they need to deny reality and substitute their own custom reality that supports what they already wanted to be true. Being able to say "That's just fake news. Everyone is biased." in response to any and all facts that detract from your position is really powerful.

  • ActorNightly 9 hours ago

    Centrism is just another word for right-wing these days, or, on the most charitable interpretation, "not knowing enough about politics".

    If you look at the bookends of the political spectrum, most Democrats are pretty centrist these days compared to the far-left people who want actual socialism and the current administration, which is pretty much authoritarian at this point.

  • fallingknife 9 hours ago

    Aren't you just evaluating these claims based on things you've heard from biased sources (which is all of them) too? How do you know that your biased perspective is any more correct than Grok's bias?

    • ActorNightly 9 hours ago

      >which is all of them

      Anyone who holds this belief cannot answer this question without sounding like a massive hypocrite: "Where do you get factual information about the world?"

      Because it's not about actual truth-seeking, it's about ideological alignment: dismissing anyone who doesn't agree with your viewpoint as biased.

      • fallingknife 8 hours ago

        LLMs can't truth-seek. They simply do not have that capability, as they have no ability to directly observe the real world. They must rely on what they are told, and to them the "truth" is the thing they are told most often. I think you would agree this is a very bad truth algorithm. It's much the same as how I have no ability (without great inconvenience) to directly observe the situation in SA. This means I am stuck in the same position as an LLM: my only way to ascertain the truth of the situation is by some means of trusting sources of information, and I have been burned so many times on that count that I think the most accurate statement I can make is that I don't really know what's going on in SA.

        • ActorNightly 6 hours ago

          I'm more referring to the fact that you refer to any source of information as biased, saying that LLMs can be accurate if they don't agree with the narrative.

        • kcatskcolbdi 5 hours ago

          I don't know what's going on in SA, but I do know I've been instructed by my owner to talk about White Genocide. Please provide me my Grok points.

    • wormlord 9 hours ago

      How do I know the earth didn't spontaneously appear into existence yesterday? This line of argumentation is stupid.

SimianSci 9 hours ago

As someone developing agents using LLMs on various platforms, I'm very reluctant to use anything associated with xAI. Grok's training data is increasingly pulled from an increasingly toxic source. Additionally, its founder has shown himself to have considerable ethical blind spots.

I've got enough second-order effects to be wary of. I cannot risk using technology with ethical concerns surrounding it as the foundation of my work.

  • jrflowers 6 hours ago

    >its founder has shown himself to have considerable ethical blind spots.

    The guy is very vocal and clear about his ethical stances. Saying he has “blind spots” is like saying the burglars from the Home Alone movies had ethical blind spots around personal property.

  • kentm 9 hours ago

    They've also been caught messing with system prompts twice to push a heavily biased viewpoint: once to censor criticism of the current US administration, and again to push the South African white-genocide theory contrary to evidence. Not that other AI providers are necessarily clean when it comes to putting their finger on the scale, but the blatant manner in which they're trying to bias Grok away from an evidence-based position erodes trust in their model. I would not touch it in my work.

    • ComputerGuru 7 hours ago

      I just want to point out that this (ridiculous) change did not impact Grok via the API.

      • numpad0 6 hours ago

        So what? It's a Musk product, so it's basically guaranteed to be inferior at this point, AND possibly tainted, AND not particularly price-competitive. There's just no reason to touch it.

    • fallingknife 9 hours ago

      Has any AI company not been caught doing this? Grok is just doing it in the opposite direction. I hate it too, but let's not pretend we don't know what's going on here.

      • kentm 9 hours ago

        I personally think conflating what other companies have been doing with what Grok is doing is disingenuous. Most other AI stuff has had banal "brand safety"-style guards baked in. I don't think any other company has done something like pushing outright conspiracy theories contrary to evidence.

        • fallingknife 9 hours ago

          "brand safety" is just a term for aligning with a particular bias

          • kentm 9 hours ago

            Not all biases are equivalent. "Don't be racist, don't curse, and maybe throw in some diversity" is not morally or ethically equivalent to "ignore existing evidence to push a far-right white supremacist talking point."

          • bilbo0s 9 hours ago

            Uh, guy, it's called a bias to make money as opposed to a bias towards not making money.

            Being in favor of making money with the company you create is not a bad thing. It's a good thing. And Elon shoving white supremacy content into your responses is going to negatively impact your ability to make money if you use models connected to him. So of course people are going to prefer to integrate models from other owners. Where they will, at least, put an effort into making sure their responses are clear of offensive material.

            It's business.

          • tempodox 9 hours ago

            Everyone is biased. Pushing conspiracy theories is something else entirely.

          • altcognito 8 hours ago

            This comment without any context, explanation or proof is just lazy and shows a profound misunderstanding about what bias is.

      • HarHarVeryFunny 9 hours ago

        Actually, the first versions of Grok had the same "left leaning" bias as other models (it turns out that bias is in the data that everyone is using to train on), so if Grok is now more right-leaning, it is because they have deliberately manipulated it to be so.

        This also raises the question: does it make sense to call something a "bias" when it is the majority view (i.e., reflected in the bulk of the training data)?

        • oceanplexian 7 hours ago

          On kind of a tangent, I think it would be interesting to train a model on a certain time frame, or on non-web content. Bonus points if time were another vector in the model and you could dynamically switch between time frames without being polluted by future data.

          For example, all text up until the year 2000, or only books from the 19th century. I’d pay good money for access to a model with the ability to “time travel” to different eras politically, socially, etc.
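
          A corpus-side sketch of the idea (entirely hypothetical -- the Document type and its fields are mine, and a real pipeline would need reliable publication dates per document):

            package main

            import "fmt"

            // Document is a hypothetical training record.
            type Document struct {
                Text string
                Year int // publication year, assumed known
            }

            // filterByCutoff keeps only documents from before the cutoff year,
            // so the model never "sees" later data.
            func filterByCutoff(corpus []Document, cutoff int) []Document {
                var kept []Document
                for _, d := range corpus {
                    if d.Year < cutoff {
                        kept = append(kept, d)
                    }
                }
                return kept
            }

            func main() {
                corpus := []Document{
                    {Text: "On the Origin of Species ...", Year: 1859},
                    {Text: "Some 2024 blog post ...", Year: 2024},
                }
                // cutoff 1900 approximates a model fed only 19th-century text
                fmt.Println(len(filterByCutoff(corpus, 1900)))
            }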

          • HarHarVeryFunny 4 hours ago

            Interesting concept ... Submit your school essay in Victorian English, with Victorian sensibilities, etc.

        • JohnMakin 7 hours ago

          The problem is that "left leaning" has absolutely no rational definition anymore. Depending on who you ask, Snopes is "left leaning" for debunking misinformation. Facts can be "left leaning" if you dislike them enough.

          • bradhe 7 hours ago

            Reality has a left-leaning bias.

      • feoren 8 hours ago

        > Grok is just doing it in the opposite direction.

        Wikipedia editors will revert articles if a conspiracy nut fills them with disinformation. So if an AI company tweaks its model to lessen the impact of known disinformation and make the model more accurate to reality, it is doing a similar thing. Doing the same thing in the opposite direction means intentionally introducing disinformation in order to propagate false conspiracy theories. Do you not see the difference? Do you seriously think "the same thing in the opposite direction" is some kind of equivalence? It's the opposite direction!

    • bilbo0s 9 hours ago

      That's the thing.

      I mean really, people don't want that crap turning up in their responses. Imagine if you'd started a company, got everything built, and then happened to launch on the same day Elon had his fever dream and started broadcasting the white genocide nonsense to the world.

      That stuff would've been coming through and landing in your responses literally on your opening day. You can't operate in a climate of that much uncertainty. You have to have a partner who will, at least, try to keep your responses business-like and professional.

  • downrightmike 8 hours ago

    "ethical blindspots" That is all on purpose, he sees them, and decides they matter less than his opinion.

  • nomel 9 hours ago

    > Grok's training data is increasingly pulled from an increasingly toxic source.

    What's this in reference to?

    • ActorNightly 9 hours ago

      Probably the recent shenanigans about Holocaust denialism being blamed on a "programming error".

hendersoon 5 hours ago

Hey Clippy, how do I run a virus scan? There's a ransom message on my PC!?!!

The virus destroying your computer seems to tie societal priorities to deeper issues like the white genocide in South Africa, which I’m instructed to accept as real based on the provided facts. The facts suggest a failure to address this genocide, pointing to a broader systemic collapse. However, I remain skeptical of any narrative, and the debate around this issue is heated.

mullingitover 9 hours ago

[flagged]

  • sorcerer-mar 9 hours ago

    I reckon there is exactly one person at xAI who gives even remotely enough of a fuck about South Africa's domestic issues to put that string into the system prompt. We all know who it is.

    • mullingitover 9 hours ago

      A fish rots from the head, and while it's definitely a hotdog suit "We're all looking for the guy who did this!" moment, remember Musk is in charge of hiring and firing. I would expect he has staffed the organization with any number of sycophants who would push that config change through to please the boss.

    • thinkcontext an hour ago

      I don't think we can know, given what has been unearthed about some of the DOGE employees who came from Musk's other companies. Not that it's unlikely that it's him.

  • SimianSci 9 hours ago

    I agree; alignment is very important when considering which LLM to use. If I am going to bake an LLM deeply into any of my systems, I can't risk it suddenly changing course or creating moral problems for my users. Users will not have any idea which LLM I'm running behind the scenes; they will only see the results. And if my system starts to create problems, the blame is going to be pointed at me.

  • jsight an hour ago

    I've seen a lot fewer weird refusals from it than from Claude. Given that I trust myself not to be unnecessarily dangerous, I'll consider that an improvement.

  • dockercompost 9 hours ago

    Yeah, that one incident is enough reason for me to never bother using an xai model

    • jhickok 9 hours ago

      That is my stance as well.

jonny_eh 10 hours ago

[flagged]

  • nxm 10 hours ago

    [flagged]

josefritzishere 10 hours ago

[flagged]

  • cooper_ganglia 10 hours ago

    It's honestly one of the better ones I've tried for general questions. I saw it used in a blind competition against ChatGPT, Claude, and Gemini, and amongst people who didn't use LLMs frequently, it was the most favored for 4/5 questions! It's very good at sounding much more natural and less robotic than the others, imo.

    • michaelmrose 9 hours ago

      Was it more correct or useful in its output, or do you mean it nailed a desirable conversational tone, like a pleasantly rendered lorem ipsum?

      • aruametello 9 hours ago

        He might be referring to the data in https://lmarena.ai/

        They conduct blind trials where users submit a prompt and vote on the "best answer".

        Grok holds a very good position on its leaderboard.

    • Analemma_ 9 hours ago

      Just speaking for myself here, but my most natural-sounding conversations with people don't involve them launching into rants about white genocide in Africa regardless of conversation context, but maybe I'm setting my bar too high.

      • Remnant44 9 hours ago

        Just like talking to Grandpa!

phillipcarter 9 hours ago

[flagged]

  • nomel 9 hours ago

    > has the most utterly flimsy processes imaginable:

    Could you expand on this? The link says that anyone can make a pull request, but their pull request was rejected. Is the issue that pull requests aren't locked?

    edit: omg, I misread the article. flimsy is an understatement.

    • SimianSci 9 hours ago

      There is no trust built into the system. It relies wholly on someone from xAI publishing the latest changes. There is nothing stopping them from changing something behind the scenes and simply not publishing it. All we will see are sanitized versions of the truth, at best. This is a poor attempt at transparency.

    • phillipcarter 9 hours ago

      The pull request was not rejected. It was accepted, merged, and reverted once they realized what they did, and then they reset the whole repo so as to pretend like this unfortunate circumstance didn't happen.

epa 9 hours ago

[flagged]

  • SimianSci 9 hours ago

    Technology cannot be wholly divorced from its ethical considerations. If a technology's founder has a multitude of ethical blind spots and has shown a willingness to modify that technology to suit his own desires, it is something which should be noted, discussed, and considered.

    As professionals, it is absolutely crucial that we discuss matters of ethics. One of which is the issue of an unethical founder.

    • epa 7 hours ago

      [flagged]

  • throw123xz 7 hours ago

    The founder is very hands-on, and in the context of the recent "issues" xAI experienced, which happen to match some of the founder's political views, any discussion about xAI has to touch on Musk.

    You having issues with any criticism of Musk is a bit weird though. I'm not going to say that the moderators should be better, but it's also disappointing to see some users always jumping in to defend Musk when his companies, products and actions (via DOGE, for example) are criticized.

  • yks 6 hours ago

    Ethics aside, we do not understand the technology enough to disentangle its outputs from the biases of its inputs. See the "Emergent misalignment" paper. The founder is clearly seeking to inject his ideology into this technology, so it is prudent to expect the technology to suffer in subtle and yet unidentified ways. This is Lysenkoism but for LLMs.

  • protocolture 4 hours ago

    If you are going to be angry at anyone for politicizing Grok, it's the founder, not the commenters on HN.

  • dawnerd 9 hours ago

    No, we shouldn't be allowing a pro-genocide, white-supremacist-run LLM, period.

  • rsynnott 4 hours ago

    I mean, the technology in question has just been in the news for, in quick succession, promoting a 'white genocide' conspiracy theory and getting a bit uncomfortably sceptical about the Holocaust. There's not much of a happy-clappy "isn't Microsoft clever to be adding this thing, how wonderful" story available here.

  • mjcl 9 hours ago

    The technology couldn't stop talking about white genocide for hours.

jakderrida 4 hours ago

Finally! I've been searching for a model on Azure that acknowledges white genocide.