<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Simone Bellavia's Web Page</title><link>https://sibellavia.lol/</link><description>Recent content on Simone Bellavia's Web Page</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Tue, 30 Sep 2025 14:57:00 +0200</lastBuildDate><atom:link href="https://sibellavia.lol/index.xml" rel="self" type="application/rss+xml"/><item><title>About DeepSeek Sparse Attention</title><link>https://sibellavia.lol/notes/2025/09/30/about-deepseek-sparse-attention/</link><pubDate>Tue, 30 Sep 2025 14:57:00 +0200</pubDate><guid>https://sibellavia.lol/notes/2025/09/30/about-deepseek-sparse-attention/</guid><description>&lt;p&gt;&lt;a href="https://github.com/deepseek-ai/DeepSeek-V3.2-Exp/blob/main/DeepSeek_V3_2.pdf"&gt;DeepSeek Sparse Attention&lt;/a&gt;, some considerations about it after reading the paper.&lt;/p&gt;
&lt;p&gt;&lt;a href="https://sibellavia.lol/notes/2025/09/30/deepseek-v3.2-exp-claude-sonnet-4.5-and-more/"&gt;DeepSeek-V3.2-Exp&lt;/a&gt; is defined as an &lt;em&gt;experimental sparse-attention model&lt;/em&gt;. Its architecture is the same as DeepSeek-V3.1-Terminus, except for the introduction of DeepSeek Sparse Attention (DSA). The core idea is to reduce the number of key-value pairs each query token looks at, instead of attending to all tokens. Sparse attention only computes a subset of entries, making long-context reasoning more feasible.&lt;/p&gt;
&lt;p&gt;Most sparse methods fix a pattern, but DSA is dynamic. It is composed of (i) a lightning indexer that computes a lightweight index score between query and candidate tokens, and selects the top-k most relevant tokens per query; (ii) a fine-grained token selection mechanism that retrieves only the key-value entries corresponding to the top-k index scores. From what I can understand, each query chooses its own set of tokens. The main advantage is that it&amp;rsquo;s more adaptive, because it&amp;rsquo;s based on a query-specific selection rather than a fixed pattern. But this introduces the need for an extra module (the indexer) and its own training. Also, performance can drop in reasoning-heavy tasks if too few tokens are selected, though this small aspect is negligible in the whole context. I think this suggests that DSA prunes aggressively but sometimes removes &amp;ldquo;useful but not obviously important&amp;rdquo; context tokens.&lt;/p&gt;
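&lt;p&gt;To make the two-stage idea concrete, here is a minimal sketch of per-query top-k selection. This is my own illustration, not the paper&amp;rsquo;s implementation: the lightning indexer is stood in for by a plain dot-product score, where the real one is a small learned module.&lt;/p&gt;

```python
# Hypothetical sketch of DSA-style sparse attention for one query token.
# The "lightning indexer" is approximated here by a plain dot-product score;
# in the paper it is a small learned module (few heads, even FP8).
import numpy as np

def sparse_attention_for_query(q, keys, values, index_scores, k):
    # (i) top-k selection driven by the lightweight index scores
    topk = np.argsort(index_scores)[-k:]
    # (ii) full attention computed only over the selected key-value entries
    scores = keys[topk] @ q / np.sqrt(q.shape[0])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ values[topk]

rng = np.random.default_rng(0)
T, d, k = 1024, 64, 128
q = rng.normal(size=d)
keys = rng.normal(size=(T, d))
values = rng.normal(size=(T, d))
index_scores = keys @ q  # stand-in for the indexer's output
out = sparse_attention_for_query(q, keys, values, index_scores, k)
print(out.shape)  # (64,)
```

&lt;p&gt;The point of the sketch is the shape of the computation: the expensive softmax attention touches only k of the T cached tokens, while the cheap indexer scores all of them.&lt;/p&gt;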
&lt;p&gt;More about the lightning indexer: it is implemented with only a few heads, and can even run in FP8. I think it&amp;rsquo;s a smart design choice. It&amp;rsquo;s lightweight enough not to negate the efficiency gains, and it offers adaptive sparsity.&lt;/p&gt;
&lt;p&gt;My takeaway for now is that hybridization is the practical path; in my opinion it is unavoidable. We&amp;rsquo;ve seen Qwen-Next models using hybrid layers, and I really like that solution: the models perform pretty well. Nevertheless, I like DSA and I think it&amp;rsquo;s compelling because it can be introduced via continued training, meaning the model doesn&amp;rsquo;t lose what it already learned under dense attention.&lt;/p&gt;</description></item><item><title>Qwen3-Max and other Qwen releases</title><link>https://sibellavia.lol/notes/2025/09/24/qwen3-max-and-other-qwen-releases/</link><pubDate>Wed, 24 Sep 2025 09:28:00 +0200</pubDate><guid>https://sibellavia.lol/notes/2025/09/24/qwen3-max-and-other-qwen-releases/</guid><description>&lt;p&gt;&lt;a href="https://qwen.ai/blog?id=241398b9cd6353de490b0f82806c7848c5d2777d&amp;amp;from=research.latest-advancements-list"&gt;Qwen3-Max&lt;/a&gt; has been released by the Qwen team. It&amp;rsquo;s their largest and most advanced large language model to date. It competes against GPT-5 and Grok 4.&lt;/p&gt;
&lt;p&gt;The base model has &lt;em&gt;over 1 trillion parameters and was pretrained on 36 trillion tokens&lt;/em&gt;. Its architecture seems to follow that of the other Qwen3-series models: it provides a highly optimized MoE design, which activates only a subset of parameters per inference. This is something we&amp;rsquo;ve already seen with the Qwen3-Next models, from which I think it also inherits the same context window.&lt;/p&gt;
&lt;p&gt;The thinking variant, Qwen3-Max-Thinking, is equipped with tool use, and they say it&amp;rsquo;s deployed &lt;em&gt;in heavy mode&lt;/em&gt;. It&amp;rsquo;s unclear to me what they mean by that: perhaps they give it far more computational resources than the non-thinking variant.&lt;/p&gt;
&lt;p&gt;They are taking the core architecture and optimizing it to the max to reduce costs and improve efficiency. It&amp;rsquo;s impressive to me.&lt;/p&gt;
&lt;p&gt;In the last 12 hours, Qwen has released:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Qwen3-Max&lt;/li&gt;
&lt;li&gt;Qwen3-VL-235B-A22B: most powerful vision-language model in the series&lt;/li&gt;
&lt;li&gt;Upgrade to Qwen3-Coder: improved terminal tasks, safer code gen&lt;/li&gt;
&lt;li&gt;Qwen3Guard: safety moderation series for real-time AI content filtering&lt;/li&gt;
&lt;li&gt;Personal AI Travel Designer: new feature in Qwen Chat for personalized trip planning&lt;/li&gt;
&lt;li&gt;Qwen3-LiveTranslate-Flash: low-latency live translation model for real-time audio/text&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;While Qwen is continuing to optimize and release new models, I&amp;rsquo;ll wait for DeepSeek. I&amp;rsquo;m convinced they are cooking.&lt;/p&gt;</description></item><item><title>Go has added Valgrind support</title><link>https://sibellavia.lol/notes/2025/09/23/go-has-added-valgrind-support/</link><pubDate>Tue, 23 Sep 2025 15:28:00 +0200</pubDate><guid>https://sibellavia.lol/notes/2025/09/23/go-has-added-valgrind-support/</guid><description>&lt;p&gt;&lt;a href="https://go-review.googlesource.com/c/go/+/674077"&gt;Go has added Valgrind support.&lt;/a&gt; While reading the commit, I saw this:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Instead of adding the Valgrind headers to the tree, and using cgo to
call the various Valgrind client request macros, we just add an assembly
function which emits the necessary instructions to trigger client
requests.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;This is super interesting. Let&amp;rsquo;s have a quick look at the code:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre tabindex="0" class="chroma"&gt;&lt;code class="language-go" data-lang="go"&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// Copyright 2025 The Go Authors. All rights reserved.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// Use of this source code is governed by a BSD-style&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// license that can be found in the LICENSE file.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="cp"&gt;//go:build valgrind &amp;amp;&amp;amp; linux&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="err"&gt;#&lt;/span&gt;&lt;span class="nx"&gt;include&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s"&gt;&amp;#34;textflag.h&amp;#34;&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// Instead of using cgo and using the Valgrind macros, we just emit the special client request&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// assembly ourselves. The client request mechanism is basically the same across all architectures,&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// with the notable difference being the special preamble that lets Valgrind know we want to do&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// a client request.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;//
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// The form of the VALGRIND_DO_CLIENT_REQUEST macro assembly can be found in the valgrind/valgrind.h&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// header file [0].&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;//
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// [0] https://sourceware.org/git/?p=valgrind.git;a=blob;f=include/valgrind.h.in;h=f1710924aa7372e7b7e2abfbf7366a2286e33d2d;hb=HEAD&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="c1"&gt;// func valgrindClientRequest(uintptr, uintptr, uintptr, uintptr, uintptr, uintptr) (ret uintptr)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="nx"&gt;TEXT&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;runtime&lt;/span&gt;&lt;span class="err"&gt;·&lt;/span&gt;&lt;span class="nf"&gt;valgrindClientRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;SB&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;NOSPLIT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;56&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// Load the address of the first of the (contiguous) arguments into AX.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;LEAQ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;args&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;FP&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;AX&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// Zero DX, since some requests may not populate it.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;XORL&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;DX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;DX&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// Emit the special preabmle.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;ROLQ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;DI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;ROLQ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;DI&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;ROLQ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;61&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;DI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;ROLQ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;51&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;DI&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// &amp;#34;Execute&amp;#34; the client request.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;XCHGQ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;BX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;BX&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;// Copy the result out of DX.&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;MOVQ&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;DX&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;ret&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;48&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;FP&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="line"&gt;&lt;span class="cl"&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nx"&gt;RET&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;This is the amd64 assembly for the Valgrind client request. This asm emits the exact instruction sequence that Valgrind&amp;rsquo;s macro &lt;code&gt;VALGRIND_DO_CLIENT_REQUEST&lt;/code&gt; would have produced in C, just without cgo.&lt;/p&gt;
&lt;p&gt;On arm64, the same idea is implemented with different registers and the AArch64 &amp;ldquo;marker&amp;rdquo; Valgrind looks for.&lt;/p&gt;
&lt;p&gt;It&amp;rsquo;s nice because they do everything in the language itself, even when that means dropping down to assembly. I can imagine a few reasons for doing it this way: avoiding cgo keeps the runtime pure Go, but most importantly it gives them control.&lt;/p&gt;
&lt;p&gt;Really interesting to me that the Go team decided to follow this route. Also, I&amp;rsquo;m not a fan of cgo.&lt;/p&gt;</description></item><item><title>GPT-5-Codex and Codex improvements</title><link>https://sibellavia.lol/notes/2025/09/15/gpt-5-codex-and-codex-improvements/</link><pubDate>Mon, 15 Sep 2025 23:30:00 +0200</pubDate><guid>https://sibellavia.lol/notes/2025/09/15/gpt-5-codex-and-codex-improvements/</guid><description>&lt;p&gt;New model in town! &lt;a href="https://openai.com/index/introducing-upgrades-to-codex/"&gt;GPT-5-Codex&lt;/a&gt; is a version of GPT-5 built specifically for agentic coding in Codex. Here&amp;rsquo;s what you need to know:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It dynamically adapts its &lt;em&gt;thinking&lt;/em&gt; based on the complexity of the task.&lt;/li&gt;
&lt;li&gt;Adheres to &lt;a href="https://agents.md/"&gt;AGENTS.md&lt;/a&gt; instructions.&lt;/li&gt;
&lt;li&gt;It has been trained specifically for conducting code reviews and finding critical flaws.&lt;/li&gt;
&lt;li&gt;GPT-5 and GPT-5-Codex achieve comparable accuracy on SWE-bench Verified (72.8% vs. 74.5%), but GPT-5-Codex shows a clear advantage in code refactoring tasks (51.3% vs. 33.9%).&lt;/li&gt;
&lt;li&gt;OpenAI found that comments by GPT‑5-Codex are less likely to be incorrect or unimportant: GPT-5-Codex produces fewer incorrect comments (4.4% vs. 13.7%) and more high-impact comments (52.4% vs. 39.4%) than GPT-5. Interestingly, GPT-5 makes more comments per pull request on average (1.32 vs. 0.93), but with lower precision and impact.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Many are complaining about the naming and the &amp;ldquo;Codex everywhere&amp;rdquo;. Honestly, I don&amp;rsquo;t care so much about the poor naming scheme as long as models and tools are good.&lt;/p&gt;
&lt;p&gt;GPT-5-Codex is not available in the API but it will be soon. To use it, you will need Codex CLI, so make sure to install it: &lt;code&gt;npm i -g @openai/codex&lt;/code&gt;. &lt;a href="https://x.com/sama/status/1967674950502015165"&gt;@sama&lt;/a&gt; claims that GPT-5-Codex already represents ~40% of traffic for Codex.&lt;/p&gt;
&lt;p&gt;I installed and tried it (yes, I hadn&amp;rsquo;t done so before; this is my first time using Codex). You can choose the model reasoning effort: with &lt;code&gt;/model&lt;/code&gt;, Codex lets you choose between &lt;code&gt;gpt-5-codex low&lt;/code&gt;, &lt;code&gt;gpt-5-codex medium&lt;/code&gt; and &lt;code&gt;gpt-5-codex high&lt;/code&gt;, although &lt;a href="https://x.com/embirico/status/1967655551762075861"&gt;OpenAI recommends leaving the model_reasoning_effort at the default (medium)&lt;/a&gt; to take the most advantage of the more dynamic reasoning effort.&lt;/p&gt;
&lt;p&gt;Along with the model, they also provided more updates:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Codex runs in a sandboxed environment with network access disabled, whether locally or in the cloud.&lt;/li&gt;
&lt;li&gt;In Codex CLI, you can now resume a previous interactive session.&lt;/li&gt;
&lt;li&gt;Once turned on for a GitHub repo, Codex automatically reviews PRs.&lt;/li&gt;
&lt;li&gt;It is possible to asynchronously delegate tasks to Codex Cloud.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;And more.&lt;/p&gt;
&lt;p&gt;I think they&amp;rsquo;re heading in the right direction, actually. They&amp;rsquo;re focusing their efforts on the tools, which is good. What&amp;rsquo;s more, I have to say that I&amp;rsquo;ve reevaluated GPT-5 and am using it daily instead of Claude. That&amp;rsquo;s why I appreciate and welcome these new releases.&lt;/p&gt;
&lt;p&gt;Last but not least, &lt;a href="https://github.com/openai/codex"&gt;Codex is open-source!&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Safe C++ proposal is not being continued</title><link>https://sibellavia.lol/posts/2025/09/safe-c-proposal-is-not-being-continued/</link><pubDate>Sat, 13 Sep 2025 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/posts/2025/09/safe-c-proposal-is-not-being-continued/</guid><description>&lt;p&gt;One year ago, the &lt;a href="https://safecpp.org/draft.html"&gt;Safe C++ proposal&lt;/a&gt; was made. The goal was to add a safe subset/context into C++ that would give strong guarantees (memory safety, type safety, thread safety) similar to what Rust provides, without breaking existing C++ code. It was an extension or superset of C++. The opt-in mechanism was to explicitly mark parts of the code that belong to the safe context. The authors even state:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Code in the safe context exhibits the same strong safety guarantees as code written in Rust.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;The rest remains &amp;ldquo;unsafe&amp;rdquo; in the usual C++ sense. This means that existing code continues to work, while new or refactored parts can gain safety. For those who write Rust, Safe C++ has many similarities with Rust, sometimes with adjustments to fit C++&amp;rsquo;s design. Also, because C++ already has a huge base of &amp;ldquo;unsafe code&amp;rdquo;, Safe C++ has to provide mechanisms for mixing safe and unsafe, and for incremental migration. In that sense, all of Safe C++&amp;rsquo;s safe features are opt-in. Existing code compiles and works as before. Introducing safe context doesn’t break code that doesn’t use it.&lt;/p&gt;
&lt;p&gt;The proposal caught my interest. It seemed like a good compromise to make C++ safe, although there were open or unresolved issues, which is completely normal for a draft proposal. For example, how error reporting for the borrow checker and lifetime errors would work, or how generic code and templates would interact with lifetime logic and safe/unsafe qualifiers. These are just some of the points, the proposal is very long and elaborate. Moreover, I am not a programming language designer, so there might be better alternatives.&lt;/p&gt;
&lt;p&gt;Anyway, today I discovered that the proposal will no longer be pursued. When I thought about the proposal again this morning, I realized I hadn’t read any updates on it for some time. So I searched and found some answers on &lt;a href="https://www.reddit.com/r/cpp/comments/1lhbqua/any_news_on_safe_c/"&gt;Reddit&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The response from Sean Baxter, one of the original authors of the Safe C++ proposal:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Safety and Security working group voted to prioritize Profiles over Safe C++. Ask the Profiles people for an update. Safe C++ is not being continued.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;And again:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The Rust safety model is unpopular with the committee. Further work on my end won&amp;rsquo;t change that. Profiles won the argument. All effort should go into getting Profile&amp;rsquo;s language for eliminating use-after-free bugs, data races, deadlocks and resource leaks into the Standard, so that developers can benefit from it.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;So I went to read the documents related to Profiles[1][2][3][4]. I&amp;rsquo;ll try to summarize what I understood: they are meant to define modes of C++ that impose constraints on how you use the language and library, in order to guarantee certain safety properties. They are primarily compile-time constraints, though in practice some checks may be implemented using library facilities that add limited runtime overhead. Instead of introducing entirely new language constructs, profiles mostly restrict existing features and usages. The idea is that you can enable a profile, and any code using it agrees to follow the restrictions. If you don’t enable it, things work as before. So it&amp;rsquo;s backwards-compatible.&lt;/p&gt;
&lt;p&gt;Profiles seem less radical and more adoptable: a safer-by-default C++ that aims to tackle the most common C++ pitfalls without forcing the Rust model. I think Safe C++ was more ambitious: introducing new syntax, type qualifiers, safe vs unsafe contexts, etc. Some in the committee felt that was too heavy, and Profiles are seen as a more pragmatic path. The main objection is obvious: one could say that Profiles guarantee less than what Safe C++ aimed to provide.&lt;/p&gt;
&lt;p&gt;Reading comments here and there, there is visible resistance in the community toward adopting the Rust model, and from a certain point of view, I understand it. If you want to write like Rust, just write Rust. Historically, C++ is a language that has often taken features from other worlds and integrated them into itself. In this case, I think that safety subsets of C++ already exist informally somehow. Profiles are an attempt to standardize and unify something that already exists in practice. Technically, they don’t add new fundamental semantics. Instead, they provide constraints, obligations and guarantees.&lt;/p&gt;
&lt;p&gt;In my opinion, considering the preferences of the committee and the entire C++ community, although I appreciated the Safe C++ proposal and was looking forward to seeing concrete results, considering the C++ context I believe that standardizing and integrating the Profiles as proposed is a much more realistic approach. Profiles might not be perfect, but they are better than nothing. They will likely be uneven in enforcement and weaker than Safe C++ in principle. They won&amp;rsquo;t give us silver-bullet guarantees, but they are a realistic path forward.&lt;/p&gt;
&lt;p&gt;[1] &lt;a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3081r1.pdf"&gt;Core safety profiles for C++26&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[2] &lt;a href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3589r0.pdf"&gt;C++ Profiles: The Framework&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[3] &lt;a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3704r0.pdf"&gt;What are profiles?&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[4] &lt;a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3651r0.pdf"&gt;Note to the C++ standards committee members&lt;/a&gt;&lt;/p&gt;
&lt;hr&gt;
&lt;p&gt;Join the conversation on &lt;a href="https://news.ycombinator.com/item?id=45234460"&gt;Hacker News&lt;/a&gt; and &lt;a href="https://www.reddit.com/r/cpp/comments/1ngjemb/safe_c_proposal_is_not_being_continued/"&gt;Reddit&lt;/a&gt;.&lt;/p&gt;</description></item><item><title>Qwen 3 Next</title><link>https://sibellavia.lol/notes/2025/09/12/qwen-3-next/</link><pubDate>Fri, 12 Sep 2025 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/notes/2025/09/12/qwen-3-next/</guid><description>&lt;p&gt;&lt;a href="https://x.com/Alibaba_Qwen/status/1966197643904000262"&gt;The Qwen team released two new models&lt;/a&gt;: Qwen3-Next-80B-A3B-Instruct and Qwen3-Next-80B-A3B-Thinking.
Both are already present on &lt;a href="https://huggingface.co/collections/Qwen/qwen3-next-68c25fd6838e585db8eeea9d"&gt;HuggingFace&lt;/a&gt;. Qwen also published &lt;a href="https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&amp;amp;from=research.latest-advancements-list"&gt;a post on their blog&lt;/a&gt;.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;Compared to the MoE structure of Qwen3, Qwen3-Next introduces several key improvements: a hybrid attention mechanism, a highly sparse Mixture-of-Experts (MoE) structure, training-stability-friendly optimizations, and a multi-token prediction mechanism for faster inference.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Both models are based on the Qwen3-Next-80B-A3B-Base model, which only activates 3 billion parameters per token. Qwen 3 Next is an ultra-sparse MoE with 512 experts, activating 10 routed experts and 1 shared expert per token. It&amp;rsquo;s also based on a hybrid architecture combining &lt;em&gt;Gated DeltaNet + Gated Attention&lt;/em&gt;.&lt;/p&gt;
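&lt;p&gt;A toy illustration of what &amp;ldquo;ultra-sparse&amp;rdquo; means here (my own sketch, not Qwen&amp;rsquo;s actual router): out of 512 experts, each token activates only 10 routed experts plus 1 shared expert, so only a small fraction of the weights is touched per token.&lt;/p&gt;

```python
# Toy top-k MoE routing using the numbers from the post: 512 experts,
# 10 routed + 1 shared active per token. Hypothetical, for illustration only.
import numpy as np

NUM_EXPERTS, TOP_K, d = 512, 10, 32
rng = np.random.default_rng(0)
router_w = rng.normal(size=(d, NUM_EXPERTS))
experts = rng.normal(size=(NUM_EXPERTS, d, d)) * 0.01  # one tiny FFN per expert
shared_expert = rng.normal(size=(d, d)) * 0.01         # always-on shared expert

def moe_forward(x):
    logits = x @ router_w
    topk = np.argsort(logits)[-TOP_K:]        # pick the 10 routed experts
    gates = np.exp(logits[topk] - logits[topk].max())
    gates /= gates.sum()                      # softmax gate over the selected experts
    routed = sum(g * (x @ experts[i]) for g, i in zip(gates, topk))
    return routed + x @ shared_expert         # plus the 1 shared expert

y = moe_forward(rng.normal(size=d))
print(y.shape)  # (32,)
```

&lt;p&gt;Per token, only 11 of the 512 expert matrices are used, which is how an 80B-parameter model can run with roughly 3B active parameters.&lt;/p&gt;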
&lt;p&gt;They say Qwen3-Next-80B-A3B-Instruct approaches their 235B flagship, and Qwen3-Next-80B-A3B-Thinking seems to outperform Gemini-2.5-Flash-Thinking.&lt;/p&gt;
&lt;p&gt;Qwen 3 Next natively supports context lengths of up to 262,144 tokens, but they even validated it on context lengths of up to 1 million tokens using the YaRN method. YaRN is supported by &lt;code&gt;transformers&lt;/code&gt;, &lt;code&gt;vllm&lt;/code&gt; and &lt;code&gt;sglang&lt;/code&gt;.&lt;/p&gt;</description></item><item><title>iPhone Air</title><link>https://sibellavia.lol/notes/2025/09/10/iphone-air/</link><pubDate>Wed, 10 Sep 2025 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/notes/2025/09/10/iphone-air/</guid><description>&lt;p&gt;Apple presented the iPhone Air, &lt;em&gt;the thinnest iPhone ever&lt;/em&gt;. This is the only new release from Apple that got my interest during their presentation event.&lt;/p&gt;
&lt;p&gt;Its design is interesting: the entire logic board and A19 Pro chip are compacted into the camera bump (which includes both front and rear cameras). This iPhone is all battery and screen. IMHO, it seems like a strategic move for the coming years: the iPhone Air will serve as an experiment and launchpad for ultra-thin devices, or simply as a research and development testbed for designs that pack powerful technology into ultra-compact form factors.&lt;/p&gt;
&lt;p&gt;Remarkably, the iPhone Air has the A19 Pro, which is Apple&amp;rsquo;s latest SoC. In more detail: it is built on TSMC&amp;rsquo;s N3P process node, and benefits &lt;em&gt;from a 20% increase in transistor density compared to its predecessor, the N3E node, according to a 2023 IEEE study on semiconductor scaling&lt;/em&gt;. The A19 Pro features a six-core CPU with two high-performance cores and four efficiency cores, and a five-core GPU. Each GPU core has its own Neural Accelerators, which Apple claims allow for MacBook Pro-level performance in an iPhone. On the new iPhone Pro, they are even more powerful. If the M5 chip gets this GPU upgrade&amp;hellip; well, NVIDIA should start to feel some pressure.&lt;/p&gt;
&lt;p&gt;To summarize: local AI to the Max. Next year, I want local LLMs on my phone.&lt;/p&gt;</description></item><item><title>npm debug and chalk packages compromised</title><link>https://sibellavia.lol/notes/2025/09/09/npm-debug-and-chalk-packages-compromised/</link><pubDate>Tue, 09 Sep 2025 22:10:00 +0200</pubDate><guid>https://sibellavia.lol/notes/2025/09/09/npm-debug-and-chalk-packages-compromised/</guid><description>&lt;p&gt;Yesterday, a lot of npm packages were compromised with malicious code. Here is a list of the affected packages:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="mailto:ansi-styles@6.2.2"&gt;ansi-styles@6.2.2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:debug@4.4.2"&gt;debug@4.4.2&lt;/a&gt; (appears to have been yanked as of 8 Sep 18:09 CEST)&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:chalk@5.6.1"&gt;chalk@5.6.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:supports-color@10.2.1"&gt;supports-color@10.2.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:strip-ansi@7.1.1"&gt;strip-ansi@7.1.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:ansi-regex@6.2.1"&gt;ansi-regex@6.2.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:wrap-ansi@9.0.1"&gt;wrap-ansi@9.0.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:color-convert@3.1.1"&gt;color-convert@3.1.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:color-name@2.0.1"&gt;color-name@2.0.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:is-arrayish@0.3.3"&gt;is-arrayish@0.3.3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:slice-ansi@7.1.1"&gt;slice-ansi@7.1.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:color@5.0.1"&gt;color@5.0.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:color-string@2.1.1"&gt;color-string@2.1.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:simple-swizzle@0.2.3"&gt;simple-swizzle@0.2.3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:supports-hyperlinks@4.1.1"&gt;supports-hyperlinks@4.1.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:has-ansi@6.0.1"&gt;has-ansi@6.0.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:chalk-template@1.1.1"&gt;chalk-template@1.1.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="mailto:backslash@0.2.1"&gt;backslash@0.2.1&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;and more, I think. I suggest reading the original post published on aikido.dev[1] and the related HN discussion[2]; both links are reported below.&lt;/p&gt;
&lt;p&gt;All packages appear to contain &lt;em&gt;a piece of code that would be executed on the client of a website, which silently intercepts crypto and web3 activity in the browser, manipulates wallet interactions, and rewrites payment destinations so that funds and approvals are redirected to attacker-controlled accounts without any obvious signs to the user&lt;/em&gt; (as shared from Aikido).&lt;/p&gt;
&lt;p&gt;You can run grep or rg to check if your codebase has been impacted &amp;ndash; thanks to sindresorhus for this suggestion:&lt;/p&gt;
&lt;p&gt;&lt;code&gt;rg -u --max-columns=80 _0x112fa8&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;This one requires ripgrep, but you can do the same with plain &lt;code&gt;grep&lt;/code&gt; (ripgrep is a faster Rust reimplementation of grep), e.g. &lt;code&gt;grep -r "_0x112fa8" .&lt;/code&gt;&lt;/p&gt;
&lt;p&gt;My thoughts about this: dependency hell is real and these are the results. I agree with &lt;a href="https://x.com/mitchellh/status/1965409636024221901"&gt;Mitchell Hashimoto when he says that npm should adopt some strategies to mitigate these risks&lt;/a&gt;, such as rejecting all dependencies that have fewer than 1k LoC. I mean, let&amp;rsquo;s just avoid using external packages to determine if an object can act like an array.&lt;/p&gt;
&lt;p&gt;Also, I would like to share one insight reported by DDerTyp on HN:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;One of the most insidious parts of this malware&amp;rsquo;s payload, which isn&amp;rsquo;t getting enough attention, is how it chooses the replacement wallet address. It doesn&amp;rsquo;t just pick one at random from its list.
It actually calculates the Levenshtein distance between the legitimate address and every address in its own list. It then selects the attacker&amp;rsquo;s address that is visually most similar to the original one.
This is a brilliant piece of social engineering baked right into the code. It&amp;rsquo;s designed to specifically defeat the common security habit of only checking the first and last few characters of an address before confirming a transaction.&lt;/p&gt;
&lt;/blockquote&gt;
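&lt;p&gt;The selection logic described in the quote can be sketched in a few lines of pure Python (a hypothetical illustration: the function names and addresses below are mine, not taken from the actual payload):&lt;/p&gt;

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance between two strings.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def closest_lookalike(legit: str, candidates: list[str]) -> str:
    # Pick the attacker address visually most similar to the legitimate one.
    return min(candidates, key=lambda c: levenshtein(legit, c))

# Made-up example: one candidate shares the prefix and suffix of the
# legitimate address, so it wins despite a different middle section.
legit = "0xAB12ffffffffCD34"
pool = ["0xAB12aaaaaaaaCD34", "0x0000000000000000"]
print(closest_lookalike(legit, pool))
```

&lt;p&gt;Checking only the first and last few characters of the printed address would not reveal the swap, which is exactly the habit this trick defeats.&lt;/p&gt;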
&lt;p&gt;This needs a bit more investigation, for which I don&amp;rsquo;t have enough time, but it looks interesting.&lt;/p&gt;
&lt;p&gt;[1] &lt;a href="https://www.aikido.dev/blog/npm-debug-and-chalk-packages-compromised"&gt;Original post&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;[2] &lt;a href="https://news.ycombinator.com/item?id=45169657"&gt;Hacker News discussion&lt;/a&gt;&lt;/p&gt;</description></item><item><title>Fil-C, a memory safe implementation of C and C++</title><link>https://sibellavia.lol/notes/2025/09/06/fil-c-a-memory-safe-implementation-of-c-and-c/</link><pubDate>Sat, 06 Sep 2025 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/notes/2025/09/06/fil-c-a-memory-safe-implementation-of-c-and-c/</guid><description>&lt;p&gt;Yesterday, &lt;a href="https://fil-c.org/"&gt;Fil-C&lt;/a&gt; popped up to the top of &lt;a href="https://news.ycombinator.com/item?id=45133938"&gt;Hacker News&lt;/a&gt;. This time the submission got a fair amount of traction, sparking a lot of interest in the community, including a comment from Andrew Kelley. In fact, I’ve been interested in Fil-C for about a year already: my first submission on Hacker News was eight months ago. So I can say I’ve been actively following the project’s progress, also thanks to the activity of its creator, @filpizlo, on Twitter.&lt;/p&gt;
&lt;p&gt;Fil-C is a compiler that implements the C and C++ languages with a memory-safe approach. Recently, Filip has published more documentation about the Garbage Collector and about the capabilities he calls &amp;ldquo;InvisiCaps&amp;rdquo;, which are more related to pointer safety.&lt;/p&gt;
&lt;p&gt;Well, for me this is kind of a dream. I love the C language, it&amp;rsquo;s my favorite, but I admit I have some skill issues when it comes to memory management, though not because of the language itself, but rather due to my own code-writing proficiency, which could definitely be better. Recently, I’ve been exploring Rust and Zig precisely for this reason, and I’ve found myself appreciating Zig more than Rust because of its minimalism. Having a memory-safe implementation of C would therefore resolve a lot of the headaches caused by memory management.&lt;/p&gt;
&lt;p&gt;Fil-C seems like the sweet spot between academic research and pragmatic work. Beyond the documentation, there’s also &lt;a href="https://fil-c.org/programs_that_work"&gt;a list of programs already ported to Fil-C&lt;/a&gt;, showing that sometimes no code changes are required, and when they are, the effort is moderate.&lt;/p&gt;
&lt;p&gt;So, the next step for me is to dig deeper into the topic and try it out myself! In the meantime, I thought it would be fair to personally share what Filip is doing, because the project deserves much more attention than it’s currently getting, imo.&lt;/p&gt;</description></item><item><title>The more you fuck around, the more you find out</title><link>https://sibellavia.lol/notes/2025/08/13/the-more-you-fuck-around-the-more-you-find-out/</link><pubDate>Wed, 13 Aug 2025 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/notes/2025/08/13/the-more-you-fuck-around-the-more-you-find-out/</guid><description>&lt;p&gt;There is a very interesting law that I think is worth sharing:&lt;/p&gt;
&lt;figure class="blog-image"&gt;
&lt;img src="https://sibellavia.lol/images/the-more-law/the-more-law.png"
alt="The more you fuck around, the more you find out law"
loading="lazy"&gt;
&lt;/figure&gt;
&lt;p&gt;I will apply it more often.&lt;/p&gt;</description></item><item><title>I am Sicilian and I support Strait of Messina Bridge</title><link>https://sibellavia.lol/notes/2025/08/11/i-am-sicilian-and-i-support-strait-of-messina-bridge/</link><pubDate>Mon, 11 Aug 2025 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/notes/2025/08/11/i-am-sicilian-and-i-support-strait-of-messina-bridge/</guid><description>&lt;p&gt;I’m Sicilian, and I support building the Strait of Messina Bridge.&lt;/p&gt;
&lt;p&gt;Premise: I don’t vote for Matteo Salvini, the current Minister of Infrastructure and Transport and promoter of the project. Even though, to be fair, many before him have tried to start construction on the Bridge, since it’s been discussed ever since the post-World War II era. In modern times, both Silvio Berlusconi and Matteo Renzi (two politicians from opposing sides) have pushed the idea. So my support doesn’t come from political or ideological alignment, but from a pragmatic standpoint.&lt;/p&gt;
&lt;p&gt;I’m in favor of building the Bridge because I’m in favor of progress. We’re talking about a world-class engineering project: a suspension bridge with a single span of 3,300 meters (over two miles), the longest in the world. Naturally, it will bring together some of the brightest minds to contribute to its construction. We’re daring to attempt something unprecedented and structurally thrilling, and we’re doing it between two Italian regions, Sicily and Calabria, that have historically been intellectual and cultural cradles not only for Italy, but for the entire world. That excites me.&lt;/p&gt;
&lt;p&gt;It’s also an investment currently estimated at 13 billion euros. With projects of this scale, I doubt the final figure will really stay there, because you have to factor in circumstances and unforeseen events. So, to that number, you can probably add another 17 or 18 billion euros. On paper, it has all the potential to generate new jobs and opportunities, both for companies in the sector and beyond.&lt;/p&gt;
&lt;p&gt;A bridge means greater connectivity and logistical efficiency. Right now, traveling from Sicily to Calabria happens via ferry, something that has become an icon for us Sicilians. Yes, this system could be improved and strengthened. But it’s also true that it still represents an obstacle to the continuous flow of rail traffic between regions. That problem could be solved with the Bridge, which, according to the plan, will have two highway lanes and, in the middle, two railway tracks. This would allow high-speed trains to reach Sicily.&lt;/p&gt;
&lt;p&gt;These are some of the advantages that lead me to support the construction of the bridge, regardless of my political ideology, which has no influence on my opinion here. But what are the counterarguments from the public? Below are the ones I read most often, with some of my thoughts.&lt;/p&gt;
&lt;p&gt;Seismic risk: according to the plan, the bridge will have to withstand earthquakes up to a certain level on the Richter scale. I expect the design to largely incorporate structural and tolerance systems for such events.&lt;/p&gt;
&lt;p&gt;Environmental impact: this factor would definitely need to be monitored, even though it has been overlooked in the construction of other major public works, not only in Italy but around the world. Speaking from ignorance, I imagine the terrain will need to be modified for the creation of the towers’ supporting foundations. I expect this to be done in compliance with existing environmental risk regulations.&lt;/p&gt;
&lt;p&gt;High costs: this is undeniably a very expensive project. What’s often not taken into account, however, are the long-term returns and the economic flow generated by indirect effects: creation of new businesses, jobs, and so on.&lt;/p&gt;
&lt;p&gt;I’m genuinely enthusiastic about the project. Above all, the concept of “connection” fascinates me and sparks my imagination.&lt;/p&gt;</description></item><item><title>Helm: what I like and dislike</title><link>https://sibellavia.lol/notes/2025/06/22/helm-what-i-like-and-dislike/</link><pubDate>Sun, 22 Jun 2025 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/notes/2025/06/22/helm-what-i-like-and-dislike/</guid><description>&lt;p&gt;I have been working with Helm for some time now and I&amp;rsquo;ve developed a love-hate relationship with it. It seems to have become the de-facto package manager for K8s, and there are good reasons for that. But like any tool, it comes with its own set of frustrations that can make you question your life choices. Some honest thoughts follow.&lt;/p&gt;
&lt;h2 id="what-i-like"&gt;What I like&lt;/h2&gt;
&lt;p&gt;I find Helm fairly simple to get started with. For all its complexity under the hood, Helm has a gentle learning curve in my opinion. You can start deploying applications with basic &lt;code&gt;helm install&lt;/code&gt; commands, then gradually learn about values, dependencies, and templating (unlucky) as you need to.&lt;/p&gt;
&lt;p&gt;Dependency management is good: Helm handles resolution, download, and installation of chart dependencies quite elegantly.&lt;/p&gt;
&lt;p&gt;Multi-environment support lets you deploy the same chart with different configurations. It&amp;rsquo;s a good feature when you are forced to deal with multiple environments. Also, in that regard rollbacks sometimes save you. &lt;code&gt;helm rollback app 3&lt;/code&gt; and that&amp;rsquo;s it. You&amp;rsquo;re back at revision 3. It just works.&lt;/p&gt;
&lt;h2 id="what-i-dont-like"&gt;What I don&amp;rsquo;t like&lt;/h2&gt;
&lt;p&gt;At the same time, I think Helm has fundamental design flaws that make it increasingly unsuitable for managing complex applications in modern, stratified infrastructures. First one: the Go templating system. It is Helm&amp;rsquo;s biggest strength and its greatest weakness. It offers immense flexibility, which is necessary for complex applications, but it results in code that is hard to reason about. The syntax is unintuitive and verbose, and error messages are cryptic and unhelpful. Which value is nil? What&amp;rsquo;s the context? Why can&amp;rsquo;t I just get a proper stacktrace? It&amp;rsquo;s not rare to end up debugging templates by commenting out sections and re-rendering repeatedly. I guess inheriting Go templates was a natural choice, since they are &amp;ldquo;native&amp;rdquo;. In any case, the fundamental issue isn&amp;rsquo;t just that Go templates are bad; it&amp;rsquo;s that templating YAML is inherently problematic. It often leads to indentation bugs that break parsing and makes it hard to validate templates before rendering.&lt;/p&gt;
&lt;p&gt;I don&amp;rsquo;t know if I am the only one, but I miss some kind of Drift detection logic. Someone manually edits a deployment and Helm has no idea. The next &lt;code&gt;helm upgrade&lt;/code&gt; might work, might partially fail, or might silently ignore the drift. The fact is that in complex installations with numerous microservices and dependencies, manual interventions are often necessary because Helm lacks native installation ordering. When one service fails to start while waiting for dependencies, the pragmatic solution is manual editing, but this silently breaks Helm&amp;rsquo;s understanding of your system state.&lt;/p&gt;
&lt;p&gt;Helm&amp;rsquo;s approach to secrets is &amp;ldquo;just base64 encode it and hope for the best.&amp;rdquo; Tools like Helm Secrets exist, but they feel like band-aids on a fundamental design issue.&lt;/p&gt;
&lt;p&gt;Helm is purely client-side and imperative, even if we consider it partially declarative. This goes back to the drift problem above. Helm fires commands and just trusts that what it thinks is deployed actually matches reality. Also, everything requires manual intervention. In fact, many teams end up with (imho) awkward combinations: ArgoCD + Helm, Flux + Helm, or just Helm + CI/CD. The fact that we&amp;rsquo;re retrofitting declarative behavior onto an imperative tool shows just how much the ecosystem has evolved past Helm&amp;rsquo;s original assumptions.&lt;/p&gt;
&lt;p&gt;Helm doesn&amp;rsquo;t manage install order or readiness. It simply ensures the sub-charts are included in the final rendered manifests; there is absolutely no guarantee of install order or readiness. As I said above, you usually solve this with hacks like init containers, complex readiness probes, or running multiple &lt;code&gt;helm install&lt;/code&gt; commands in a specific order. I don&amp;rsquo;t like it.&lt;/p&gt;
&lt;h2 id="the-alternative"&gt;The Alternative&lt;/h2&gt;
&lt;p&gt;I don&amp;rsquo;t have an alternative as of now. Also, I don&amp;rsquo;t think there could be a better, community-supported alternative to Helm. Not because I think it&amp;rsquo;s not feasible, quite the contrary! Helm is widely used at the enterprise level and is fully supported by the CNCF, so I just believe that an alternative must be truly worthwhile to justify a change. In any case, I believe that the next step beyond Helm is a native Kubernetes system that uses CRDs, is declarative, and imposes a package structure standard similar to Linux. I hope to be able to create a proof of concept in the future :-)&lt;/p&gt;</description></item><item><title>Recurrent Neural Networks (RNNs) explained</title><link>https://sibellavia.lol/posts/2024/02/recurrent-neural-networks-rnns-explained/</link><pubDate>Fri, 16 Feb 2024 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/posts/2024/02/recurrent-neural-networks-rnns-explained/</guid><description>&lt;p&gt;&lt;strong&gt;Recurrent Neural Networks (RNNs)&lt;/strong&gt; are a class of neural networks designed to process sequential data, such as time series, text, audio, or any other type of sequential data. RNNs were developed to overcome the limitations of feedforward networks that don&amp;rsquo;t maintain a memory of past information.&lt;/p&gt;
&lt;h2 id="from-feedforward-to-rnns"&gt;From Feedforward to RNNs&lt;/h2&gt;
&lt;p&gt;Let&amp;rsquo;s take a general look at &lt;strong&gt;Feedforward Networks,&lt;/strong&gt; to then better understand RNNs.&lt;/p&gt;
&lt;figure class="blog-image"&gt;
&lt;img src="https://sibellavia.lol/images/rnn-explained/fnn.png"
alt="A multi layer feed forward neural network"
loading="lazy"&gt;
&lt;/figure&gt;
&lt;p&gt;In feedforward networks, input is processed in a single pass, from input to output, without retaining any memory of previous inputs. Each input is treated independently from the others, making feedforward networks suboptimal for tasks requiring an understanding of data sequences or temporal contexts. This behavior is a direct consequence of the structure of a feedforward network: relatively simple and linear, with layers of neurons connecting directly one after the other in one direction, without cycles. Each layer receives input only from the previous layer and sends output only to the next layer. Feedforward networks are employed in specific areas, such as classification and regression tasks where the order of inputs isn&amp;rsquo;t relevant (image classification or predicting time-independent values).&lt;/p&gt;
&lt;figure class="blog-image"&gt;
&lt;img src="https://sibellavia.lol/images/rnn-explained/rnn.png"
alt="A Recurrent Neural Network"
loading="lazy"&gt;
&lt;/figure&gt;
&lt;p&gt;RNNs, unlike feedforward networks with unidirectional information flow and independent layer weights, feature recurrent connections enabling each hidden layer to be shaped by both the current input and the previous hidden state&amp;rsquo;s output. This creates an &lt;strong&gt;&amp;ldquo;internal memory&amp;rdquo;,&lt;/strong&gt; useful for processing data sequences by allowing the network to consider past inputs, thus handling temporal dependencies. Ideal for tasks like language modeling, speech recognition, and sequence generation, RNNs use this memory to manage the sequence context, a key difference from the simpler feedforward approach. We will see their graphical visualization later, but first, a mathematical digression is useful to better understand how an RNN works.&lt;/p&gt;
&lt;h2 id="rnns-key-equations"&gt;RNNs Key Equations&lt;/h2&gt;
&lt;p&gt;In RNNs, there are multiple key equations and important mathematical concepts to understand. However, we will focus on the two most comprehensive equations for an RNN, which specify all the necessary calculations for computation at each time step on the forward pass in a simple recurrent neural network.
$$h^{(t)}=\sigma(W^{hx}x^{(t)}+W^{hh}h^{(t-1)}+b_h)$$
This equation represents how the hidden state $h^{(t)}$ is updated at time $t$. Each hidden state is calculated based on three components:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The product of the weight matrix between the input and the hidden layer $W^{hx}$ and the input at time $t$, $x^{(t)}$.&lt;/li&gt;
&lt;li&gt;The product of the recurrent weight matrix between the hidden layer and itself at adjacent time steps $W^{hh}$ and the hidden state at the previous time $t-1$, $h^{(t-1)}$.&lt;/li&gt;
&lt;li&gt;The bias vector $b_h$.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;These three terms are summed together and then passed through an activation function $\sigma$, such as the sigmoid function or the hyperbolic tangent, which updates the RNN&amp;rsquo;s hidden state $h^{(t)}$.&lt;/p&gt;
&lt;p&gt;The same equation can be interpreted in a slightly different and alternative way, as follows:&lt;/p&gt;
&lt;p&gt;$$h^{(t)}=f(h^{(t-1)}, x^{(t)}; \theta)$$&lt;/p&gt;
&lt;p&gt;where $f$ is the recurrent transition function and $\theta$ collects all of the network&amp;rsquo;s parameters.&lt;/p&gt;
&lt;figure class="blog-image"&gt;
&lt;img src="https://sibellavia.lol/images/rnn-explained/rnn-equation.png"
alt="Equation plotted on the computational graph"
loading="lazy"&gt;
&lt;/figure&gt;
&lt;p&gt;The purpose of the activation function is to introduce non-linearity into the model, allowing the model itself to represent complex and non-linear relationships between input and output variables.&lt;/p&gt;
&lt;p&gt;The matrices mentioned in the formula are weight matrices that connect the previous hidden state to the current hidden state, the input to the hidden state, and the hidden state to the output.
$$\hat{y}^{(t)}=softmax(W^{yh}h^{(t)}+b_y)$$
This equation is a typical representation of the output phase in an RNN, where the output $\hat{y}^{(t)}$ at time $t$ is calculated using the softmax function. In detail:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;$\hat{y}^{(t)}$ is the output predicted by the network at time $t$,&lt;/li&gt;
&lt;li&gt;$W^{yh}$ is the weight matrix that connects the hidden state $h^{(t)}$ to the output,&lt;/li&gt;
&lt;li&gt;$h^{(t)}$ is the hidden state at time $t$,&lt;/li&gt;
&lt;li&gt;$b_y$ is the bias vector associated with the output,&lt;/li&gt;
&lt;li&gt;the softmax function is an activation function used in multi-class classifications to transform scores (logits) into probabilities.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In an RNN, the softmax output $\hat{y}^{(t)}$ is often used to determine the probability of each possible next element in a sequence, such as the next word in a text. It can be used to evaluate performance during training or to generate new sequences during the inference process.&lt;/p&gt;
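&lt;p&gt;The two key equations can be sketched together as a single forward step in plain Python (a toy sketch with tiny, made-up dimensions and random weights, not an efficient implementation):&lt;/p&gt;

```python
import math, random

random.seed(0)

def matvec(W, v):
    # W is a list of rows; returns the matrix-vector product W @ v.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(a, b):
    return [x + y for x, y in zip(a, b)]

def softmax(z):
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(z)
    e = [math.exp(x - m) for x in z]
    s = sum(e)
    return [x / s for x in e]

def rnn_step(x, h_prev, Whx, Whh, bh, Wyh, by):
    # h_t = tanh(W^{hx} x_t + W^{hh} h_{t-1} + b_h)
    h = [math.tanh(v) for v in add(add(matvec(Whx, x), matvec(Whh, h_prev)), bh)]
    # y_hat_t = softmax(W^{yh} h_t + b_y)
    y = softmax(add(matvec(Wyh, h), by))
    return h, y

# Toy sizes: input dim 3, hidden dim 4, output dim 2; random small weights.
rand = lambda r, c: [[random.uniform(-0.5, 0.5) for _ in range(c)] for _ in range(r)]
Whx, Whh, Wyh = rand(4, 3), rand(4, 4), rand(2, 4)
bh, by = [0.0] * 4, [0.0] * 2

h = [0.0] * 4                     # initial hidden state
for x in ([1, 0, 0], [0, 1, 0]):  # a length-2 input sequence
    h, y = rnn_step(x, h, Whx, Whh, bh, Wyh, by)
print(sum(y))  # the softmax output is a probability distribution
```

&lt;p&gt;Note that the same weight matrices are reused at every step of the loop: this is exactly the parameter sharing discussed in the next section.&lt;/p&gt;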
&lt;h2 id="design-patterns-and-principles"&gt;Design Patterns and Principles&lt;/h2&gt;
&lt;p&gt;Based on the definition we have given of RNNs, and in light of their computational structure, we can identify the following principles or design patterns of RNNs:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Cyclic Structure&lt;/strong&gt;: Each unit of the RNN receives two inputs: the current element of the data sequence and the &amp;ldquo;hidden state&amp;rdquo; from the previous unit, which acts as a form of memory that carries information from one element to another in the sequence.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Hidden State&lt;/strong&gt;: The hidden state is the heart of RNNs, allowing the network to accumulate knowledge throughout the sequence. Recurrent hidden layers leverage a cyclic memory mechanism that makes them &amp;ldquo;stateful,&amp;rdquo; allowing the network to accumulate knowledge over the sequence. At each time step, the hidden state is updated based on both the current input and the previous hidden state. This feature ensures that, alongside the evolution of the hidden layers, a hidden state is also developed, which is crucial for the RNN&amp;rsquo;s ability to process sequential information.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Shared Parameters&lt;/strong&gt;: Unlike feedforward networks, where each layer has its own set of parameters, in an RNN, the same set of parameters (weights) is used at every time step. This concept is known as &amp;ldquo;shared parameters&amp;rdquo; and allows the RNN to process sequences of variable length with a fixed model.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sequential Structure&lt;/strong&gt;: RNNs are designed to work with sequential data, and their architecture reflects this characteristic. Inputs are processed one after the other, and the output from one step can influence the processing of the next.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Output&lt;/strong&gt;: At each time step, the RNN can produce an output based on the current hidden state. The output can be generated at each step (for example, in language modeling) or only at the end of the sequence (for example, in sequence classification).&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;With an understanding of the fundamental building blocks of RNNs, we now delve into their computational representation.&lt;/p&gt;
&lt;h2 id="computational-graph-and-backpropagation"&gt;Computational Graph and Backpropagation&lt;/h2&gt;
&lt;figure class="blog-image"&gt;
&lt;img src="https://sibellavia.lol/images/rnn-explained/rnn-unfolded.png"
alt="Computational Graph with an Unfolded RNN"
loading="lazy"&gt;
&lt;/figure&gt;
&lt;p&gt;It is important to delve deeper into the concept of &lt;strong&gt;unfolding&lt;/strong&gt;. In the context of RNNs, the term unfolding refers to the process of transforming the recurrent network, which intrinsically has a cyclic structure due to its hidden state passing from one time step to the next, into an extended (or unfolded) version that displays the entire sequence of operations across time steps. This unfolding transforms the cyclic structure into a chain of replicas of the network, one for each time step in the sequence, making evident how information flows through the time steps.&lt;/p&gt;
&lt;p&gt;Following the concept of unfolding, it&amp;rsquo;s essential to examine &lt;strong&gt;Backpropagation Through Time (BPTT):&lt;/strong&gt; it is an algorithm that emerges as a variant of the standard feedforward Backpropagation algorithm, specifically designed for the training of RNNs. This variation of the standard algorithm is the necessary consequence of the recurrent nature of RNNs, which requires a different approach for the calculation of gradients and for the updating of weights during the training process.&lt;/p&gt;
&lt;p&gt;The BPTT algorithm works by following these steps:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Unfolding the network: as we have previously seen, the RNN is unfolded over time to transform it into a feedforward network. This allows treating the temporal dependency between sequential inputs as connections within a larger network.&lt;/li&gt;
&lt;li&gt;Forward propagation: the input is fed through the unfolded network, and the output is calculated for each time step.&lt;/li&gt;
&lt;li&gt;Error calculation: the error (the difference between the expected value and the actual one) for each output in the sequence is calculated.&lt;/li&gt;
&lt;li&gt;Backward error propagation (Backpropagation): the error is propagated backward to calculate the gradients of the weights relative to the error. This is the most critical step for training, as it determines how to modify the weights to reduce the error.&lt;/li&gt;
&lt;li&gt;Weight update: finally, the weights are updated based on the calculated gradients. Stochastic Gradient Descent is typically used as the optimization algorithm.&lt;/li&gt;
&lt;/ol&gt;
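&lt;p&gt;The five steps above can be checked end to end on a deliberately tiny example: a scalar RNN with a squared-error loss, whose analytic BPTT gradient is compared against a finite-difference estimate (a toy sketch; all values are made up):&lt;/p&gt;

```python
import math

def forward(xs, ys, wx, wh, b):
    # Steps 1-3: unfold the scalar RNN over time and accumulate the loss.
    hs = [0.0]  # h_0 = 0
    loss = 0.0
    for x, y in zip(xs, ys):
        hs.append(math.tanh(wx * x + wh * hs[-1] + b))
        loss += (hs[-1] - y) ** 2  # squared error at each time step
    return hs, loss

def bptt(xs, ys, wx, wh, b):
    # Step 4: propagate the error backward through the unfolded network.
    hs, _ = forward(xs, ys, wx, wh, b)
    dwx = dwh = db = dh_next = 0.0
    for t in reversed(range(len(xs))):
        dh = 2 * (hs[t + 1] - ys[t]) + dh_next  # local error + error from the future
        da = (1 - hs[t + 1] ** 2) * dh          # back through the tanh
        dwx += da * xs[t]
        dwh += da * hs[t]
        db += da
        dh_next = wh * da                       # carried one step back in time
    return dwx, dwh, db

xs, ys = [0.5, -1.0, 0.25], [0.1, -0.2, 0.3]
wx, wh, b = 0.7, 0.3, 0.1
dwx, dwh, db = bptt(xs, ys, wx, wh, b)

# Sanity check: a central finite difference on wh matches the analytic gradient.
eps = 1e-6
numeric = (forward(xs, ys, wx, wh + eps, b)[1]
           - forward(xs, ys, wx, wh - eps, b)[1]) / (2 * eps)
print(abs(numeric - dwh) < 1e-5)
```

&lt;p&gt;Step 5, the weight update, would then be e.g. &lt;code&gt;wh -= lr * dwh&lt;/code&gt; for some learning rate &lt;code&gt;lr&lt;/code&gt;.&lt;/p&gt;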
&lt;p&gt;Stochastic Gradient Descent (SGD) is a variant of the Gradient Descent algorithm, which is a technique for minimizing the cost (or loss) function associated with a model, i.e., the measure of how &amp;ldquo;far&amp;rdquo; the model is from making accurate predictions. I believe I will write a dedicated blog post on the topic. For now, we mentioned it only to connect to the next paragraph, where we will address the problem of Vanishing/Exploding Gradient, the major limitation of RNNs.&lt;/p&gt;
&lt;h2 id="vanishingexploding-gradient-dilemma"&gt;Vanishing/Exploding Gradient Dilemma&lt;/h2&gt;
&lt;p&gt;The problem of vanishing gradients and exploding gradients are two significant challenges in training RNNs, especially when working with very long sequences. These issues affect an RNN&amp;rsquo;s ability to learn long-term dependencies between elements in a sequence. Let&amp;rsquo;s look in detail at what these problems entail and how they manifest.&lt;/p&gt;
&lt;p&gt;The vanishing gradient problem occurs when the gradient (the derivative of the loss function with respect to the network&amp;rsquo;s weights) tends towards zero during the backpropagation process. In RNNs, this is particularly problematic for long-term dependencies because of the repeated multiplication of small gradients across timesteps during BPTT. This can result in a gradient that becomes extremely small. Consequently, the weight update becomes insignificant, effectively rendering the network unable to learn from inputs that are distant in time. For this reason, the network may struggle to learn and retain information from earlier parts of the sequence.&lt;/p&gt;
&lt;p&gt;Conversely, the exploding gradient problem occurs when gradients become excessively large during backpropagation. This can lead to overly large weight updates, causing instability in the network and making the training process diverge. In RNNs, this is often due to the repeated multiplication of large gradients across timesteps, which can exponentially increase the gradient&amp;rsquo;s value as it is backpropagated in time.&lt;/p&gt;
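&lt;p&gt;Both regimes are easy to see in the simplest possible case: for a scalar linear recurrence $h_t = w h_{t-1}$, the gradient of $h_T$ with respect to $h_0$ is $w^T$, so repeated multiplication across timesteps either shrinks it toward zero or blows it up (a toy sketch with made-up numbers):&lt;/p&gt;

```python
# For h_t = w * h_{t-1}, dh_T/dh_0 = w ** T: the gradient is a product of
# T identical factors, so it vanishes for |w| < 1 and explodes for |w| > 1.
T = 50
for w in (0.9, 1.1):
    print(f"w={w}: |dh_T/dh_0| = {w ** T:.3e}")
```

&lt;p&gt;With nonlinear activations the factors also include the activation derivatives, but the multiplicative structure, and hence the dilemma, is the same.&lt;/p&gt;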
&lt;p&gt;To overcome this issue, various techniques have been adopted that led to the development of new neural networks, such as Long Short-Term Memory (LSTM) networks. LSTMs were designed as a specific type of RNN cell with additional gating mechanisms, allowing them to selectively remember or forget information based on its relevance. These gating mechanisms help preserve important long-term dependencies in the hidden state vector and enable LSTMs to better handle sequences with long-term dependencies than traditional RNNs.&lt;/p&gt;
&lt;h2 id="ending"&gt;Ending&lt;/h2&gt;
&lt;p&gt;I&amp;rsquo;ve tried to explain clearly and summarize the most important points about RNNs. I dedicated a blog post to RNNs because I consider them important and foundational for the study of subsequent models, like Transformers. Anyway, even though traditional RNNs face difficulties in capturing long-term dependencies, variants like LSTM and GRU have been developed to overcome these limits. To understand and use them, I believe it&amp;rsquo;s necessary to know how RNN architectures are constructed.&lt;/p&gt;
&lt;h2 id="sources"&gt;Sources&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://www.deeplearningbook.org/contents/rnn.html"&gt;https://www.deeplearningbook.org/contents/rnn.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/abs/1506.00019"&gt;https://arxiv.org/abs/1506.00019&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/1312.6026.pdf"&gt;https://arxiv.org/pdf/1312.6026.pdf&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://arxiv.org/pdf/1901.00434.pdf"&gt;https://arxiv.org/pdf/1901.00434.pdf&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.researchgate.net/publication/228394623_A_brief_review_of_feed-forward_neural_networks"&gt;https://www.researchgate.net/publication/228394623_A_brief_review_of_feed-forward_neural_networks&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>Applying Deep Learning to detect Rhegmatogenous Retinal Detachment</title><link>https://sibellavia.lol/posts/2020/06/applying-deep-learning-to-detect-rhegmatogenous-retinal-detachment/</link><pubDate>Sat, 20 Jun 2020 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/posts/2020/06/applying-deep-learning-to-detect-rhegmatogenous-retinal-detachment/</guid><description>&lt;p&gt;&lt;strong&gt;Retinal detachment is an eye disease and is one of the most serious ocular emergencies.&lt;/strong&gt; It occurs after a layer of the retina - essential tissue for vision - is lifted from the pigmented epithelial tissue, dragging the blood vessels that supply nutrients and the eye with it. If the retina is no longer nourished through contact with the pigment epithelium layer, cell death and a progressive and functional loss of vision or of the detached portion of the retina occur after 48 hours.&lt;/p&gt;
&lt;p&gt;However, the retina can be re-attached surgically, resulting in ocular improvement, as long as the detachment time is not excessive (Heussen N. et al., 2012). &lt;strong&gt;Below it is demonstrated how machine learning and deep learning techniques can play a crucial role in quickly saving the eyes of a patient with Rhegmatogenous Retinal Detachment.&lt;/strong&gt;&lt;/p&gt;
&lt;h2 id="retinal-detachment-types-and-consequences"&gt;Retinal detachment: types and consequences&lt;/h2&gt;
&lt;p&gt;There are four types of retinal detachment:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Rhegmatogenous Retinal Detachment (RRD):&lt;/strong&gt; the most frequent, is due to a break in the retina that allows the vitreous, gelatinous liquid that is inside the eye, to enter the subretinal space, thus allowing the detachment of its adherent portion;&lt;/li&gt;
&lt;li&gt;Tractional: typical in cases of retinal ischemia, it is generated by fibrovascular tissue bands that exert a centrifugal traction on the retina capable of detaching it;&lt;/li&gt;
&lt;li&gt;Exudative: it is due to the presence of fluid under the retina following inflammation or vascular lesions;&lt;/li&gt;
&lt;li&gt;Mixed forms.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Rhegmatogenous Retinal Detachment is a serious condition that can lead to blindness; nevertheless, the chances of a cure are directly proportional to how promptly appropriate treatment is delivered, before retinal cell death sets in. Early diagnosis and rapid intervention play a fundamental role, and in this case &lt;strong&gt;Deep Learning techniques represent an aid capable of meeting these needs.&lt;/strong&gt; In the study by Ohsugi and collaborators, Machine Learning technologies are applied to detect RRD, using &lt;strong&gt;ultra-wide field fundus images and studying their performance&lt;/strong&gt; (Ohsugi, H., Tabuchi, H., Enno, H. et al., 2017).&lt;/p&gt;
&lt;p&gt;These images are acquired with a &lt;strong&gt;scanning laser ophthalmoscope (SLO),&lt;/strong&gt; an instrument used to evaluate the ocular fundus. It was designed to capture images of the retinal layers simultaneously with confocal images of the fundus (W H Woon, F W Fitzke, A C Bird, J Marshall, 1992). Wide-field fundus images can be acquired easily, without pupillary mydriasis and without medical complications, so even examiners not qualified to perform ophthalmological surgery can acquire them safely, making this instrument ideal in emergencies when ophthalmologists are not available.&lt;/p&gt;
&lt;p&gt;&lt;figure class="blog-image"&gt;
&lt;img src="https://sibellavia.lol/images/posts/deep-learning-rrd/fundus.png"
alt="Representative fundus images obtained by ultra wide field scanning laser ophthalmoscopy. Ultra wide field right fundus images without rhegmatogenous retinal detachment (RRD) (a) and with RRD (b). The arrow indicates the retinal break and the arrowheads indicate the areas of RRD."
loading="lazy"&gt;
&lt;/figure&gt;
&lt;em&gt;Representative fundus images obtained by ultra wide field scanning laser ophthalmoscopy. Ultra wide field right fundus images without rhegmatogenous retinal detachment (RRD) (a) and with RRD (b). The arrow indicates the retinal break and the arrowheads indicate the areas of RRD.&lt;/em&gt;&lt;/p&gt;
&lt;h2 id="machine-learning-and-rrd-early-diagnosis"&gt;Machine Learning and RRD early diagnosis&lt;/h2&gt;
&lt;p&gt;The study evaluates the ability of a deep learning model to detect RRD from images obtained through SLO.&lt;/p&gt;
&lt;p&gt;The model is a &lt;strong&gt;multilayer CNN that automatically learns the characterizing patterns of the images&lt;/strong&gt; and uses them as a classification system (Deng, J. et al., 2009).&lt;/p&gt;
&lt;p&gt;&lt;figure class="blog-image"&gt;
&lt;img src="https://sibellavia.lol/images/deep-learning-rrd/ml.png"
alt="CNN architecture used. The Input is represented by the RGB 96x96 pixel image. Each of the convolutional layers (Conv1–3) is followed by an activation function layer (ReLU), pooling layers (MP1–3), and two fully connected layers (FC1, FC2)."
loading="lazy"&gt;
&lt;/figure&gt;
&lt;em&gt;CNN architecture used. The Input is represented by the RGB 96x96 pixel image. Each of the convolutional layers (Conv1–3) is followed by an activation function layer (ReLU), pooling layers (MP1–3), and two fully connected layers (FC1, FC2).&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The convolutional layers extract features from the input through convolutional filters, while the max-pooling layers (MP1, MP2 and MP3) make recognition more robust to position. The last two layers (FC1, FC2) are fully connected: they discard the spatial information of the extracted feature maps and perform the final classification the CNN aims to achieve. For training the neural network, &lt;strong&gt;100 images were processed,&lt;/strong&gt; and the AdaGrad optimization algorithm was used to train the network weights.&lt;/p&gt;
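To give a feel for how the conv/pool stack above shrinks a 96x96 input, here is a minimal shape-tracking sketch. The study summary does not specify kernel sizes or strides, so the 3x3 convolutions with padding 1 and 2x2 max-pooling below are assumptions for illustration only.

```python
# Track the spatial resolution of a square image through three
# conv + max-pool stages (Conv1-3 / MP1-3 as in the architecture above).
# Kernel sizes, padding and strides are assumed, not taken from the paper.

def conv2d_out(size, kernel=3, padding=1, stride=1):
    """Output side length of a square convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def maxpool_out(size, kernel=2, stride=2):
    """Output side length of a square max-pooling layer."""
    return (size - kernel) // stride + 1

size = 96  # RGB 96x96 input
for stage in ("Conv1/MP1", "Conv2/MP2", "Conv3/MP3"):
    size = maxpool_out(conv2d_out(size))
    print(stage, "->", size, "x", size)
# Under these assumptions the resolution halves at each stage:
# 96 -> 48 -> 24 -> 12, and FC1/FC2 then operate on the flattened maps.
```

The fully connected layers then consume the flattened 12x12 feature maps, which is why they "remove" spatial information.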
&lt;p&gt;411 images from 407 RRD patients and 420 images from 238 non-RRD patients were used. The deep learning model demonstrated a high sensitivity of 97.6% [95% confidence interval (CI), 94.2-100%] and a high specificity of 96.5%. &lt;strong&gt;The model used, therefore, can improve medical care and increase the timeliness of intervention, resulting in an early diagnosis that can prevent RRD-derived blindness.&lt;/strong&gt;&lt;/p&gt;
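As a reminder of what these two metrics measure, sensitivity is the fraction of RRD images correctly flagged and specificity is the fraction of non-RRD images correctly cleared. The confusion counts below are hypothetical, chosen only to illustrate the formulas; they are not the study's counts.

```python
def sensitivity(tp, fn):
    # True-positive rate: RRD images correctly classified as RRD.
    return tp / (tp + fn)

def specificity(tn, fp):
    # True-negative rate: non-RRD images correctly classified as non-RRD.
    return tn / (tn + fp)

# Hypothetical confusion counts for illustration (not from the study).
tp, fn = 97, 3   # of 100 RRD images, 97 detected
tn, fp = 95, 5   # of 100 non-RRD images, 95 cleared
print(f"sensitivity = {sensitivity(tp, fn):.1%}")  # 97.0%
print(f"specificity = {specificity(tn, fp):.1%}")  # 95.0%
```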
&lt;h3 id="bibliography"&gt;Bibliography&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Heussen N. et al. (2012). Scleral buckling versus primary vitrectomy in rhegmatogenous retinal detachment study (SPR study): Risk assessment of anatomical outcome. SPR study report no. 7. Acta Ophthalmologica.&lt;/li&gt;
&lt;li&gt;Ohsugi, H., Tabuchi, H., Enno, H. et al. (2017). Accuracy of deep learning, a machine-learning technology, using ultra–wide-field fundus ophthalmoscopy for detecting rhegmatogenous retinal detachment. Scientific Reports, Article number: 9425.&lt;/li&gt;
&lt;li&gt;W H Woon, F W Fitzke, A C Bird, J Marshall. (1992). Confocal imaging of the fundus using a scanning laser ophthalmoscope. British Journal of Ophthalmology.&lt;/li&gt;
&lt;li&gt;Deng, J. et al. (2009). ImageNet: a large-scale hierarchical image database. IEEE Conference on Computer Vision and Pattern Recognition, 248–255.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>A generic introduction to Clinical Decision Support Systems</title><link>https://sibellavia.lol/posts/2020/06/a-generic-introduction-to-clinical-decision-support-systems/</link><pubDate>Sat, 13 Jun 2020 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/posts/2020/06/a-generic-introduction-to-clinical-decision-support-systems/</guid><description>&lt;p&gt;A &lt;strong&gt;Clinical Decision Support System&lt;/strong&gt; is defined as an &amp;ldquo;active knowledge system, which uses two or more patient data elements to generate case-specific advice&amp;rdquo; (DSS, 2001). This implies that a CDSS is simply a Decision Support System focused on the use of &lt;strong&gt;knowledge management&lt;/strong&gt; in order to obtain &lt;strong&gt;clinical advice for patient care&lt;/strong&gt; based on multiple variables inherent to it. The main purpose of the modern CDSS, therefore, is &lt;strong&gt;to assist doctors in patient treatment cases&lt;/strong&gt; or in emergency situations (Benner, 2007).&lt;/p&gt;
&lt;h2 id="cdss-an-introduction"&gt;CDSS: an introduction&lt;/h2&gt;
&lt;p&gt;As anticipated, CDSS were originally conceived to literally make important decisions, effectively replacing the doctor: the latter entered the information and waited for the CDSS to provide the &amp;ldquo;right&amp;rdquo; choice, so as to simply act on it. Today, however, a CDSS only provides suggestions: the doctor examines the information provided, keeping the useful suggestions and discarding those deemed unsuitable (DSS, 2001). An important example of CDSS is certainly &lt;strong&gt;CBR, Case-Based Reasoning&lt;/strong&gt; (Shahina Begum, Mobyen Uddin Ahmed, Peter Funk, N. Xiong, Mia Folke, 2001), which uses data from previous clinical cases to &lt;strong&gt;indicate a specific health treatment.&lt;/strong&gt; A CBR system was developed to facilitate radiotherapy treatment planning for brain cancer: given a new patient case, the system retrieves a similar case from an archive of successfully treated patients, together with its treatment plan, and re-adapts the plan to the needs of the new case. Results on real-world brain cancer patient cases have shown that &lt;strong&gt;the success rate of the new CBR retrieval is higher than that of the original system&lt;/strong&gt; (Khussainova, Gulmira; Petrovic, Sanja; Jagannathan, Rupa, 2015).&lt;/p&gt;
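The "retrieve" step of the CBR cycle described above can be sketched as a nearest-neighbour search over past cases. The patient features, case archive, and plan names below are entirely hypothetical; a real system would use clinically validated features and a domain-specific similarity measure.

```python
# Minimal sketch of CBR retrieval: given a new patient case, find the
# most similar archived case by Euclidean distance over numeric features.
# All features, values and plan names here are hypothetical.

import math

archive = [
    {"age": 54, "tumour_size_cm": 3.1, "plan": "plan-A"},
    {"age": 67, "tumour_size_cm": 1.8, "plan": "plan-B"},
    {"age": 49, "tumour_size_cm": 4.0, "plan": "plan-C"},
]

def distance(a, b, features=("age", "tumour_size_cm")):
    """Euclidean distance over the chosen numeric features."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

def retrieve(new_case, cases):
    """Return the archived case closest to the new one."""
    return min(cases, key=lambda c: distance(new_case, c))

best = retrieve({"age": 52, "tumour_size_cm": 3.4}, archive)
print(best["plan"])  # the retrieved plan is then adapted to the new case
```

The subsequent "re-adapt" step, which tailors the retrieved plan to the new patient, is where systems like the one by Khussainova et al. differ most.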
&lt;h2 id="a-clinical-decision-support-system-for-prevention-of-venous-thromboembolism-effect-on-physician-behavior"&gt;A Clinical Decision Support System for Prevention of Venous Thromboembolism: Effect on Physician Behavior&lt;/h2&gt;
&lt;p&gt;In the study by Durieux et al., &lt;strong&gt;the effect of a CDSS on physician behavior&lt;/strong&gt; during clinical decisions for the prevention of venous thromboembolism was investigated (Durieux P, Nizard R, Ravaud P, Mounier N, Lepage E, 2000). In hospital clinical practice, venous thromboembolism remains a serious problem, and pulmonary embolism is a leading cause of death (Frederick A. Anderson, H. Brownell Wheeler, Robert J. Goldberg, 1991). The most effective way to prevent fatal and non-fatal venous thromboembolism is to &lt;strong&gt;use routine prophylaxis for high-risk patients.&lt;/strong&gt; Optimal decisions on the use of anticoagulants require access to a large amount of complex information to assess the risk level of hospitalized patients. The objective of the study was to determine whether presenting guidelines for venous thromboembolism prophylaxis through a CDSS increased the proportion of appropriate anticoagulant prescriptions.&lt;/p&gt;
&lt;p&gt;The study was conducted as a time-series analysis between December 1997 and July 1999 in the orthopedic surgery department of a university hospital in Paris, on a sample of 1971 patients undergoing orthopedic surgery. A CDSS designed to provide immediate information on the prevention of venous thromboembolism among surgical patients was integrated into daily medical practice during three 10-week intervention periods, alternating with four 10-week control periods, with a 4-week washout between periods. The proportion of anticoagulant prescriptions in line with the established clinical protocols was analyzed during the intervention periods and compared with the control periods. &lt;strong&gt;The results show the importance of using the CDSS:&lt;/strong&gt; doctors complied with the guidelines in 82.8% of cases during the control periods and in 94.9% of cases during the intervention periods. During each intervention period, &lt;strong&gt;the adequacy of prescribing increased significantly,&lt;/strong&gt; and each time the CDSS was removed, medical practice reverted to what was observed before the intervention. It can be concluded that implementing clinical guidelines for venous thromboembolism prophylaxis through a CDSS integrated into the hospital information system &lt;strong&gt;changed physician behavior and improved compliance with the guidelines.&lt;/strong&gt;&lt;/p&gt;
&lt;h3 id="bibliography"&gt;Bibliography&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;DSS. (2001). &lt;a href="https://web.archive.org/web/20010715105143/http:/www.openclinical.org/dss.html"&gt;Open Clinical - Knowledge Management for Medical Care&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Benner, E. (2007). Clinical Decision Support Systems. Springer.&lt;/li&gt;
&lt;li&gt;Shahina Begum, Mobyen Uddin Ahmed, Peter Funk, N. Xiong, Mia Folke. (2001). Case-Based Reasoning Systems in the Health Sciences: A Survey of Recent Trends and Developments. IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, 421-434.&lt;/li&gt;
&lt;li&gt;Khussainova, Gulmira; Petrovic, Sanja; Jagannathan, Rupa. (2015). Retrieval with clustering in a case-based reasoning system for radiotherapy treatment planning. Journal of Physics: Conference Series.&lt;/li&gt;
&lt;li&gt;Durieux P, Nizard R, Ravaud P, Mounier N, Lepage E. (2000). A Clinical Decision Support System for Prevention of Venous Thromboembolism: Effect on Physician Behavior. JAMA.&lt;/li&gt;
&lt;li&gt;Frederick A. Anderson, H. Brownell Wheeler, Robert J. Goldberg. (1991). A Population-Based Perspective of the Hospital Incidence and Case-Fatality Rates of Deep Vein Thrombosis and Pulmonary Embolism. JAMA, 933-938.&lt;/li&gt;
&lt;/ol&gt;</description></item><item><title>About</title><link>https://sibellavia.lol/about/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/about/</guid><description>&lt;p&gt;I&amp;rsquo;m Simone Bellavia. I&amp;rsquo;m passionate about software engineering, deep learning and distributed systems.&lt;/p&gt;
&lt;p&gt;On this weblog, I share my thoughts on various topics that inspire me: from technical insights in AI and machine learning to reflections on building software.&lt;/p&gt;
&lt;h2 id="connect"&gt;Connect&lt;/h2&gt;
&lt;p&gt;You can find me on &lt;a href="https://twitter.com/sibellavia"&gt;Twitter&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Thanks for stopping by!&lt;/p&gt;</description></item><item><title>Archive</title><link>https://sibellavia.lol/archive/</link><pubDate>Mon, 01 Jan 0001 00:00:00 +0000</pubDate><guid>https://sibellavia.lol/archive/</guid><description/></item></channel></rss>