
'Tis the week for small AI models, it seems. Nonprofit AI research institute Ai2 on Thursday released Olmo 2 1B, a 1-billion-parameter model that Ai2

Ai2's new small AI model outperforms similarly-sized models from Google, Meta
On Thursday, Ai2, the nonprofit AI research institute, released Olmo 2 1B, a 1-billion-parameter model that Ai2 claims beats similarly-sized models from Google, Meta, and Alibaba on several benchmarks. Parameters, sometimes referred to as weights, are the internal components of a model that guide its behavior.
Olmo 2 1B is available under a permissive Apache 2.0 license on the AI dev platform Hugging Face. Unlike most models, Olmo 2 1B can be replicated from scratch; Ai2 has provided the code and data sets (Olmo-mix-1124, Dolmino-mix-1124) used to develop it.
Alibaba unveils Qwen 3, a family of 'hybrid' AI reasoning models
Chinese tech company Alibaba released Qwen 3, a family of AI models that the company claims outperforms some of the best.
Chinese tech company Alibaba on Monday released Qwen 3, a family of AI models the company claims matches and in some cases outperforms the best models available from Google and OpenAI.
Future AI might not need supercomputers thanks to models like BitNet b1.58 2B4T.
OpenAI's o3 AI model scores lower on a benchmark than the company initially implied
A discrepancy between first- and third-party benchmark results for OpenAI's o3 AI model is raising questions about the company's transparency and model testing practices.
The dataset is even pre-formatted for machine learning.
Nova Sonic can detect your tone.
WordPress.com launches a free AI-powered website builder
WordPress.com has launched a new AI site builder that allows anyone to create a functioning website using an AI chat-style interface.
The new tariff plan is confusing, and the tech industry is scrambling to make sense of it.
Leaked data exposes a Chinese AI censorship machine
One academic who reviewed the dataset said it was "clear evidence" that China, or its affiliates, wants to use AI to improve repression.
cross-posted from: https://lemmy.sdf.org/post/31892983
TLDR:
- China has developed an Artificial Intelligence (AI) system that adds to its already powerful censorship machine, scanning content for all kinds of topics like corruption, military issues, Taiwan politics, satire
- The discovery was accidental: security researchers found an unsecured Elasticsearch database on the web, hosted by Chinese company Baidu
- Experts highlight that AI-driven censorship is evolving to make state control over public discourse even more sophisticated, especially after recent releases like China's AI model DeepSeek
A complaint about poverty in rural China. A news report about a corrupt Communist Party member. A cry for help about corrupt cops shaking down entrepreneurs.
These are just a few of the 133,000 examples fed into a sophisticat
This blog post compares the coding capabilities of the new Gemini 2.5 Pro Experimental and Claude 3.7 Sonnet (thinking)
Is Cursor done?
With Cline + Gemini 2.5 Pro, one can get the exact same feature set that Cursor and Windsurf provide. They only call the APIs of the big LLM Providers without an advanced secret sauce.
It's even the opposite - they worsen model performance by limiting context size. The key advantage, the fixed monthly costs instead of variable API usage, is now gone with Gemini 2.5 Pro...
What is left that justifies their ridiculous valuation atm?
The country poured billions into AI infrastructure, but the data center gold rush is unraveling as speculative investments collide with weak demand and DeepSeek shifts AI trends.
We surveyed 730 coders and developers about how (and how often) they use AI chatbots on the job. The results amazed and disturbed us.
The main point is that, for a free society, digital literacy matters more than ever.
The evidence-backed model delivered impressive results, but it doesn't validate the wave of AI therapy bots flooding the market.
Gemini 2.5: Our most intelligent AI model
Gemini 2.5 is our most intelligent AI model, now with thinking.
DeepSeek AI model can easily be breached for malware, security researcher Tenable warns
Tenable Research examines DeepSeek R1 and its capability to develop malware, such as a keylogger and ransomware. We found it provides a useful starting point, but requires additional prompting and debugging.
cross-posted from: https://lemmy.sdf.org/post/31583546
Security researcher Tenable successfully used DeepSeek to create a keylogger that could hide an encrypted log file on disk as well as develop a simple ransomware executable.
At its core, DeepSeek can create the basic structure for malware. However, it is not capable of doing so without additional prompt engineering as well as manual code editing for more advanced features. For instance, DeepSeek struggled with implementing process hiding. "We got the DLL injection code it had generated working, but it required lots of manual intervention," Tenable writes in its report.
**"Nonetheless, DeepSeek provides a useful compilation of techniques and search terms that can help someone with no prior experience in writing malicious code the ability to quickly famil
Trust Report DeepSeek R1: "Critical levels of risk with security and ethics, high levels of risk with privacy, stereotype, toxicity, hallucination, and fairness"
Discover the VIJIL Trust Report for DeepSeek R1, a comprehensive evaluation of security, ethics, privacy, hallucination, and performance risks in this large language model (LLM). Our analysis identifies critical security and ethical risks, high privacy vulnerabilities, and moderate hallucination ris...
cross-posted from: https://lemmy.sdf.org/post/31552333
A Trust Report for DeepSeek R1 by VIJIL, a security research company, indicates critical levels of risk with security and ethics; high levels of risk with privacy, stereotype, toxicity, hallucination, and fairness; a moderate level of risk with performance; and a low level of risk with robustness.
IBM's CEO doesn't think AI will replace programmers anytime soon
IBM CEO Arvind Krishna says that, despite the Trump administration's attacks on globalism, global trade isn't dead. In fact, he thinks that the U.S.'s key