-
Lovely visual demonstrations of the B-tree and B+tree indexing data structures, used in MySQL, Postgres, MongoDB, Dynamo, and many others
(tags: algorithms indexes indexing databases storage b-trees b+trees binary-search-trees data-structures)
Justin's Linklog Posts
Fixing aggressive Xiaomi battery management
I’ve been using a Xiaomi phone recently, running Xiaomi HyperOS 1.011.0, and one feature that bugs me constantly is that apps lose state as soon as you flip away to another app, even if only for a second; once you flip back, the app restarts. This appears to be an aspect of Xiaomi’s built in power management. I’ve been searching for a way to disable it, and allow multiple apps in memory simultaneously, and I’ve finally tracked it down. As described here, https://piunikaweb.com/2021/04/19/miui-optimization-missing-in-developer-options-try-this-workaround/ , you need to enable Developer Mode on the phone, enter “Additional Settings” / “Developer options”, then scroll all the way down, nearly to the bottom, to “Reset to default values”. Hit this _repeatedly_ (once is not enough!) until another option appears just below, called either “Turn on MIUI optimisation” or, in my case, “Turn on system optimisation”; this is enabled by default. Turn it off. In my case, this has fixed the flipping-between-apps problem, the phone in general is significantly snappier to respond, and WhatsApp and Telegram new-message notifications don’t get auto-dismissed (which was another annoying feature previously). I suspect a load of battery optimisations and CPU throttling has been disabled. It remains to be seen what this does to my battery life, but hopefully it’ll be worth it, and it’ll be nice not to lose state in Chrome forms when I have to flip over to my banking app, etc. I won’t be getting another Xiaomi phone after this; there are numerous rough edges and outright bugs in the MIUI/HyperOS platform, at least in the international ROM images, and there’s no support or documentation to work around this stuff. It’s a crappy user experience.
(tags: phones mobile xiaomi miui workarounds battery options settings)
-
A thought-provoking article:
Philip Agre enumerated five characteristics of data that will help us achieve this repositioning. Agre argued that “living data” must be able to express 1. a sense of ownership, 2. error bars, 3. sensitivity, 4. dependency, and 5. semantics. Although he originally wrote this in the early 1990s, it took some time for technology and policy to catch up. I’m going to break down each point using more contemporary context and terminology: Provenance and Agency: what is the origin of the data and what can I do with it (ownership)? Accuracy: has the data been validated? If not, what is the confidence of its correctness (error bars)? Data Flow: how is data discovered, updated, and shared (sensitivity to changes)? Auditability: what data and processes were used to generate this data (dependencies)? Semantics: what does this data represent?
(tags: culture data identity data-protection data-privacy living-data open-data)
Ethical Applications of AI to Public Sector Problems
Jacob Kaplan-Moss:
There have been massive developments in AI in the last decade, and they’re changing what’s possible with software. There’s also been a huge amount of misunderstanding, hype, and outright bullshit. I believe that the advances in AI are real, will continue, and have promising applications in the public sector. But I also believe that there are clear “right” and “wrong” ways to apply AI to public sector problems.
He breaks down AI usage into “Assistive AI”, where AI is used to process and consume information (in ways or amounts that humans cannot) to present to a human operator, versus “Automated AI”, where the AI both processes and acts upon information, without input or oversight from a human operator. The latter is unethical to apply in the public sector.(tags: ai ethics llm genai public-sector government automation)
Evading wireless tether speed caps
Handy tip from Brian Krebs – if you are tethering to a mobile phone, and network speeds are mysteriously limited, it may be your provider is throttling tethering. Changing the TTL may help, since some providers in the US at least are using a really stupid mechanism to detect tethering
(tags: tethering mobile wireless networking ttl via:brian-krebs networks)
Patent troll Sable pays up, dedicates all its patents to the public
This is a massive victory for Cloudflare — way to go!
Sable initially asserted around 100 claims from four different patents against Cloudflare, accusing multiple Cloudflare products and features of infringement. Sable’s patents — the old Caspian Networks patents — related to hardware-based router technologies common over 20 years ago. Sable’s infringement arguments stretched these patent claims to their limits (and beyond) as Sable tried to apply Caspian’s hardware-based technologies to Cloudflare’s modern software-defined services delivered on the cloud. […] Cloudflare fought back against Sable by launching a new round of Project Jengo, Cloudflare’s prior art contest, seeking prior art to invalidate all of Sable’s patents. In the end, Sable agreed to pay Cloudflare $225,000, grant Cloudflare a royalty-free license to its entire patent portfolio, and to dedicate its patents to the public, ensuring that Sable can never again assert them against another company.
(via AJ Stuyvenberg)(tags: sable cloudflare patent-trolls patents uspto trolls routing)
-
“Interactive browser-based web archiving from Webrecorder. The ArchiveWeb.page browser extension and standalone application allows you to capture web archives interactively as you browse. After archiving your webpages, your archives can be viewed using ReplayWeb.page — no extension required! For those who need to crawl whole websites with automated tools, check out Browsertrix.” This is a nice way to archive a personal dynamic site online in a read-only fashion — there is a self-hosting form of the replayer at https://replayweb.page/docs/embedding/#self-hosting . As @david302 on the Irish Tech Slack notes: “you can turn on recording, browse the (public) site you want to archive, get the .wacz file and stick that+js on s3/cloudfront.”
(tags: archiving archival archives tools web recording replay via:david302)
Turning Everyday Gadgets into Bombs is a Bad Idea
Bunnie Huang investigates the Mossad pager bomb’s feasibility, and finds it deeply worrying:
I am left with the terrifying realization that not only is it feasible, it’s relatively easy for any modestly-funded entity to implement. Not just our allies can do this – a wide cast of adversaries have this capability in their reach, from nation-states to cartels and gangs, to shady copycat battery factories just looking for a big payday (if chemical suppliers can moonlight in illicit drugs, what stops battery factories from dealing in bespoke munitions?). Bottom line is: we should approach the public policy debate around this assuming that someday, we could be victims of exploding batteries, too. Turning everyday objects into fragmentation grenades should be a crime, as it blurs the line between civilian and military technologies.
(tags: batteries israel security terrorism mossad pagers hardware devices bombs)
Modal interfaces considered harmful
A great line from the 99 Percent Invisible episode titled “Children of the Magenta (Automation Paradox, pt. 1)”, regarding the Air France flight 447 disaster:
When one of the co-pilots hauled back on his stick, he pitched the plane into an angle that eventually caused the stall. […] it’s possible that he didn’t understand that he was now flying in a different mode, one which would not regulate and smooth out his movements. This confusion about what how the fly-by-wire system responds in different modes is referred to, aptly, as “mode confusion,” and it has come up in other accidents.
(tags: automation aviation flying modal-interfaces ui ux interfaces modes mode-confusion air-france-447 disasters)
-
wordfreq is “a Python library for looking up the frequencies of words in many languages, based on many sources of data.” Sadly, it’s now longer going to be updated, as the author writes:
I don’t want to be part of this scene anymore: wordfreq used to be at the intersection of my interests. I was doing corpus linguistics in a way that could also benefit natural language processing tools. The field I know as “natural language processing” is hard to find these days. It’s all being devoured by generative AI. Other techniques still exist but generative AI sucks up all the air in the room and gets all the money. It’s rare to see NLP research that doesn’t have a dependency on closed data controlled by OpenAI and Google, two companies that I already despise. wordfreq was built by collecting a whole lot of text in a lot of languages. That used to be a pretty reasonable thing to do, and not the kind of thing someone would be likely to object to. Now, the text-slurping tools are mostly used for training generative AI, and people are quite rightly on the defensive. If someone is collecting all the text from your books, articles, Web site, or public posts, it’s very likely because they are creating a plagiarism machine that will claim your words as its own. So I don’t want to work on anything that could be confused with generative AI, or that could benefit generative AI. OpenAI and Google can collect their own damn data. I hope they have to pay a very high price for it, and I hope they’re constantly cursing the mess that they made themselves.
(tags: ai language llm nlp openai scraping words genai google)
Nevada’s genAI-driven unemployment benefits system
As has been shown many times before, current generative AI systems encode bias and racism in their training data. This is not going to go well:
“There’s no AI [written decisions] that are going out without having human interaction and that human review,” DETR’s director told the website. “We can get decisions out quicker so that it actually helps the claimant.” […] “The time savings they’re looking for only happens if the review is very cursory,” explained Morgan Shah, the director of community engagement for Nevada Legal Services. “If someone is reviewing something thoroughly and properly, they’re really not saving that much time.” Ultimately, Shah said, workers using the system to breeze through claims may end up “being encouraged to take a shortcut.” […] As with most attempts at using this still-nascent technology in the public sector, we probably won’t know how well the Nevada unemployment AI works unless it’s shown to be doing a bad job — which feels like an experiment being conducted on some of the most vulnerable members of society without their consent.
Of course, the definition of a “bad job” depends who’s defining it. If the system is processing a high volume of applications, it may not matter to its operators if it’s processing them _correctly_ or not.(tags: generative-ai ai racism bias nevada detr benefits automation)
-
Significant changes in transparency requirements for EU-based datacenter operations:
Sunday September 15th was the deadline for every single organisation in Europe operating a datacentre of more than 500 KW, to publicly disclose: how much electricity they used in the last year; how much power came from renewable sources, and how much of this relied on the company buying increasingly controversial ‘unbundled’ renewable energy credits; how much water they used; and many more datapoints […] Where this information is being disclosed, in the public domain, and discoverable, [the Green Web Foundation] intend to link to it and make it easier to find. [….] There are some concessions for organisations that have classed this information as a trade secret or commercially confidential. In this case there is a second law passed, the snappily titled Commission Delegated Regulation (EU) 2024/1364, that largely means these companies need to report this information too, but to the European Commission instead. There will be a public dataset published based on this reporting released next year, containing data an agreggated level.
(tags: datacenter emissions energy sustainability gwf via:chris-adams eu europe ec)
Over the past decade or so, I’ve been suffering with chronic migraine, sometimes with multiple attacks per week. It’s been a curse — not only do you have to suffer the periodic migraine attacks, but also the “prodrome”, where unpleasant symptoms like brain fog and an inability to concentrate can impact you.
After a long process of getting a referral to the appropriate headache clinic, and eliminating other possible medications, I finally got approved to receive Ajovy (fremanezumab), one of the new generation of CGRP inhibitor monoclonals — these work by blocking the action of a peptide on receptors in your brain. I started the course of these a month ago.
The results have, frankly, been amazing. As I hoped, the migraine episodes have reduced in frequency, and in impact; they are now milder. But on top of that, I hadn’t realised just how much impact the migraine “prodrome” had been having on my day-to-day life. I now have more ability to concentrate, without it causing a headache or brain fog; I have more energy and am less exhausted on a day-to-day basis; judging by my CPAP metrics, I’m even sleeping better. It is a radical improvement. After 10 years I’d forgotten what it was like to be able to concentrate for prolonged periods!
They are so effective that the American Headache Society is now recommending them as a first-line option for migraine prevention, ahead of almost all other treatments.
If you’re a migraine sufferer, this is a game changer. I’m delighted. It seems there may even be further options of concomitant treatment with other CGRP-targeting medications in the future, to improve matters further.
More papers on the topic: a real-world study on CGRP inhibitor effectiveness after 6 months; no “wearing-off” effect is expected.
Faster zlib/DEFLATE decompression on the Apple M1 (and x86)
Some decent low-level performance hacking on arm64/x86 (via Tony Finch)
(tags: via:fanf compression deflate optimization assembly c optimisation hacks)
Aesthetic Visual Analysis at Netflix
Good blog post about Netflix’ automated cover-shot generation using Aesthetic Visual Analysis; I’ve been meaning to hack around with this
(tags: aesthetics ava images analysis netflix algorithms)
-
by Gergely Orosz and Lou Franco:
Q: “I’d like to make a better case for paying down tech debt on my team. What are some proven approaches for this?” The tension in finding the right balance between shipping features and paying down accumulated tech debt is as old as software engineering. There’s no one answer on how best to reduce tech debt, and opinion is divided about whether zero tech debt is even a good thing to aim for. But approaches for doing it exist which work well for most teams. To tackle this eternal topic, I turned to industry veteran Lou Franco, who’s been in the software business for over 30 years as an engineer, EM, and executive. He’s also worked at four startups and the companies that later acquired them; most recently Atlassian as a Principal Engineer on the Trello iOS app.
Apparently Lou has a book on the topic imminent.(tags: programming refactoring coding technical-debt tech-debt lou-franco software)
Irish Data Protection Commission launches inquiry into Google AI
The Data Protection Commission (DPC) today announced that it has commenced a Cross-Border[1] statutory inquiry into Google Ireland Limited (Google) under Section 110 of the Data Protection Act 2018. The statutory inquiry concerns the question of whether Google has complied with any obligations that it may have had to undertake an assessment, pursuant to Article 35[2] of the General Data Protection Regulation (Data Protection Impact Assessment), prior to engaging in the processing of the personal data of EU/EEA data subjects associated with the development of its foundational AI model, Pathways Language Model 2 (PaLM 2). A Data Protection Impact Assessment (DPIA)[3], where required, is of crucial importance in ensuring that the fundamental rights and freedoms of individuals are adequately considered and protected when processing of personal data is likely to result in a high risk.
Great to see this. If this inquiry results in some brakes on the widespread misuse of “fair use” in AI scraping, particularly where it concerns European citizens, I’m all in favour.(tags: eu law scraping fair-use ai dpia dpc data-protection privacy gdpr)
Why, after 6 years, I’m over GraphQL
Decent description of the problems with using GraphQL in a public API, vs a JSON REST approach
Amazon S3 now supports strongly-consistent conditional writes
This is a bit of a gamechanger: “Amazon S3 adds support for conditional writes that can check for the existence of an object before creating it. This capability can help you more easily prevent applications from overwriting any existing objects when uploading data. You can perform conditional writes using PutObject or CompleteMultipartUpload API requests in both general purpose and directory buckets. Using conditional writes, you can simplify how distributed applications with multiple clients concurrently update data in parallel across shared datasets. Each client can conditionally write objects, making sure that it does not overwrite any objects already written by another client. This means you no longer need to build any client-side consensus mechanisms to coordinate updates or use additional API requests to check for the presence of an object before uploading data. Instead, you can reliably offload such validations to S3, enabling better performance and efficiency for large-scale analytics, distributed machine learning, and other highly parallelized workloads. To use conditional writes, you can add the HTTP if-none-match conditional header along with PutObject and CompleteMultipartUpload API requests. This feature is available at no additional charge in all AWS Regions, including the AWS GovCloud (US) Regions and the AWS China Regions.”
(tags: s3 aws conditional-writes distcomp architecture storage consistency)
AI worse than humans at summarising information, trial finds
Human summaries ran up the score by significantly outperforming on identifying references to ASIC documents in the long document, a type of task that the report notes is a “notoriously hard task” for this type of AI. But humans still beat the technology across the board. Reviewers told the report’s authors that AI summaries often missed emphasis, nuance and context; included incorrect information or missed relevant information; and sometimes focused on auxiliary points or introduced irrelevant information. Three of the five reviewers said they guessed that they were reviewing AI content. The reviewers’ overall feedback was that they felt AI summaries may be counterproductive and create further work because of the need to fact-check and refer to original submissions which communicated the message better and more concisely.
(tags: ai government llms summarisation asic llama2-70b)
1 in 50 brits have long COVID, according to new study
That is a shocking figure.
In the new paper, researchers from the Nuffield Department of Primary Care Health Sciences, in collaboration with colleagues from the Universities of Leeds and Arizona, analysed dozens of previous studies into Long COVID to examine the number and range of people affected, the underlying mechanisms of disease, the many symptoms that patients develop, and current and future treatments. They found: Long COVID affects approximately 1 in 50 people in UK and a similar or higher proportion in many other countries; People of any age, gender and ethnic background can be affected; Long COVID results from complex biological mechanisms, which lead to a wide range of symptoms including fatigue, cognitive impairment / ‘brain fog’, breathlessness and pain; Long COVID may persist for years, causing long-term disability; There is currently no cure, but research is ongoing; Risk of Long COVID can be reduced by avoiding infection (e.g., by ensuring COVID vaccines and boosters are up to date and wearing a well-fitted high filtration mask) and taking antivirals promptly if infected.
(tags: long-covid covid-19 medicine health disease uk trish-greenhaigh)
the “Old Friends” immunology hypothesis
How the “Old Friends” hypothesis is taking over from the hygiene hypothesis:
Homo Sapiens first evolved some 300,000 years ago, yet crowd infections are believed to have only developed in the last 12,000 years, a small blip in human history. Humans living in dense cities is a relatively recent development. An even more recent development is that of sealed indoor spaces and frequent international air travel. Many crowd infections, such as measles, mumps, chickenpox, colds, and flu, are airborne, spreading when humans talk and breathe in close contact, with poor ventilation. These infections could not widely spread until the last few hundred years of human history. When I began studying immunology, something that surprised me is how much of the immune system is focused on fighting parasites. There is an entire branch, including several cell types, devoted to this. It seems like such a mismatch to the modern, industrialized world. “Can I have a few more immune cell types focused on viruses or intracellular bacteria?” I thought, “in exchange for some of these parasite-focused cells that I’m not using?” Our “old friends” are quite different from the crowd infections that plague us now – it would be bizarre to assume that research based on one of these categories will apply to the other! Our “old friends”, parasitic worms and beneficial microbes, are associated with a reduced risk of allergies and autoimmune diseases. No such relationship exists for crowd diseases. In fact, the opposite is true. Crowd diseases contribute to allergies and autoimmune diseases. Comparing the immune system to a muscle that gets stronger with use is overly simplistic and, in many cases, inaccurate. There is huge variety in how various pathogens impact us. Being precise in considering different types of microbes and infections will allow us to better understand human health.
(tags: articles health medicine immunology old-friends hygiene-hypothesis allergies autoimmune disease parasites)
-
Interesting: “We help you find European alternatives for digital service and products, like cloud services and SaaS products.”
Why heroism is bad, and what we can do to stop it
“What is heroism [in an SRE team]? Why is “the Hero” a bad role to have on a team? In this article, learn about how to build your team to avoid heroism, and when heroics can indeed be useful.” Nice short preso on the negative role of “the Hero”, aka the “Hero Coder Syndrome” (via namcat)
(tags: via:namcat heroism hero-coder-syndrome sre ops oncall systems teams work emergencies)
Evaluating persistent, replicated message queues
This is exhaustive! Kafka, Postgres, mongodb, Redis, Pulsar, SQS, EventStore, RocketMQ, RabbitMQ, ActiveMQ, and RedPanda all compared as backends for a persistent, replicated message queueing system. SQS actually fares quite well
(tags: activemq kafka rabbitmq messaging queueing message-queues sqs postgres storage)
-
The Operational Program for Exchange of Weather Radar Information (OPERA) from the European National Meteorological Services (EUMETNET) — 1km-square resolution open data of current precipitation levels over Ireland and the rest of Europe, with a 5 minute latency and granularity. May be useful for a project I’m thinking of… Also related, AROME immediate forecasts: https://portail-api.meteofrance.fr/web/api/AROME-PI
(tags: eumetnet meteorology weather rainfall rain forecasting eu europe ireland)
-
The new hotness in home self-hosting microservers — a full mini PC in the form factor of a USB hub, using Intel’s N100 platform. 8GB RAM, 128GB SSD, 4K-capable GPU on-chip, for EUR 200. A comment worth noting though: “The only problem is the n100 only has PCIE Gen 3, so I/O is limited” — but apparently the N305 models help with I/O capacity.
(tags: microservers mini-pcs hardware self-hosting home gadgets n100 intel)
Folk wisdom on visual programming
A (lengthy) summary of third party comments on visual programming environments and tools, from Hacker News (via Tony Finch’s retro-links)
(tags: gui hn no-code programming tools coding visual-programming hacker-news via:fanf)
-
Kellan Elliott-McCrea of laughingmeme.org has started a new link blog! In 2024! Of course, as readers of this link blog know, link blogs never went away :)
(tags: link-blogging blogging links)
Clustering ideas with Llamafile
Working through the process of applying a local LLM to idea-clustering and labelling: – map the notes as points in a semantic space using vector embeddings; – apply k-means clustering to group nearby points in space; – map points back to groups of notes, then use a large language model to generate labels. This is interesting; I particularly like the use of local hardware
Ex-Google CEO: AI startups can steal IP, hire lawyers to “clean up the mess”
Ex-Google CEO, VC, and “Licensed arms dealer to the US military” Eric Schmidt:
here’s what I propose each and every one of you do: Say to your LLM the following: “Make me a copy of TikTok, steal all the users, steal all the music, put my preferences in it, produce this program in the next 30 seconds, release it, and in one hour, if it’s not viral, do something different along the same lines.” […] If it took off, then you’d hire a whole bunch of lawyers to go clean the mess up, right? But if nobody uses your product, it doesn’t matter that you stole all the content. And do not quote me.
jfc. Needless to say he also has some theories about ChatGPT eating Google’s lunch because of…. remote working.
(tags: law legal startups ethics eric-schmidt capitalism ip)
Engine Lines: Killing by Numbers
from James Tindall, “This is the Tyranny of the Recommendation Algorithm given kinetic and malevolent flesh” —
Eventually there were days where Israel’s air force had already reduced the previous list of targets to rubble, and the system was not generating new targets that qualified at the current threshold required for residents of Gaza to be predicted as ‘legitimate military targets,’ or ‘sufficiently connected to Hamas.’ Pressure from the chain of command to produce new targets, presumably from a desire to satisfy internal murder targets, meant that the bar at which a Gaza resident would be identified as a legitimate Hamas target was simply lowered. At the lower threshold, the system promptly generated a new list of thousands of targets. At what threshold, from 100 to 1, will the line be drawn, the decision made that the bar can be lowered no more, and the killing stop? Or will the target predictions simply continue while there remain Palestinians to target? Spotify’s next song prediction machine will always predict a next song, no matter how loosely the remaining songs match the target defined by your surveilled activity history. It will never apologise and declare: “Sorry, but there are no remaining songs you will enjoy.”
(tags: algorithms recommendations israel war-crimes genocide gaza palestine targeting)
Digital Apartheid in Gaza: Unjust Content Moderation at the Request of Israel’s Cyber Unit
from the EFF:
Government involvement in content moderation raises serious human rights concerns in every context. Since October 7, social media platforms have been challenged for the unjustified takedowns of pro-Palestinian content—sometimes at the request of the Israeli government—and a simultaneous failure to remove hate speech towards Palestinians. More specifically, social media platforms have worked with the Israeli Cyber Unit—a government office set up to issue takedown requests to platforms—to remove content considered as incitement to violence and terrorism, as well as any promotion of groups widely designated as terrorists. …. Between October 7 and November 14, a total of 9,500 takedown requests were sent from the Israeli authorities to social media platforms, of which 60 percent went to Meta with a reported 94% compliance rate. This is not new. The Cyber Unit has long boasted that its takedown requests result in high compliance rates of up to 90 percent across all social media platforms. They have unfairly targeted Palestinian rights activists, news organizations, and civil society; one such incident prompted Meta’s Oversight Board to recommend that the company “Formalize a transparent process on how it receives and responds to all government requests for content removal, and ensure that they are included in transparency reporting.” When a platform edits its content at the behest of government agencies, it can leave the platform inherently biased in favor of that government’s favored positions. That cooperation gives government agencies outsized influence over content moderation systems for their own political goals—to control public dialogue, suppress dissent, silence political opponents, or blunt social movements. And once such systems are established, it is easy for the government to use the systems to coerce and pressure platforms to moderate speech they may not otherwise have chosen to moderate.
(tags: activism censorship gaza israel meta facebook whatsapp eff palestine transparency moderation bias)
The Soul of Maintaining a New Machine
This is really fascinating stuff, on “communities of practice”, from Stewart Brand:
They ate together every chance they could. They had to. The enormous photocopiers they were responsible for maintaining were so complex, temperamental, and variable between models and upgrades that it was difficult to keep the machines functioning without frequent conversations with their peers about the ever-shifting nuances of repair and care. The core of their operational knowledge was social. That’s the subject of this chapter. It was the mid-1980s. They were the technician teams charged with servicing the Xerox machines that suddenly were providing all of America’s offices with vast quantities of photocopies and frustration. The machines were so large, noisy, and busy that most offices kept them in a separate room. An inquisitive anthropologist discovered that what the technicians did all day with those machines was grotesquely different from what Xerox corporation thought they did, and the divergence was hampering the company unnecessarily. The saga that followed his revelation is worth recounting in detail because of what it shows about the ingenuity of professional maintainers at work in a high-ambiguity environment, the harm caused by an institutionalized wrong theory of their work, and the invincible power of an institutionalized wrong theory to resist change.
(tags: anthropology culture history maintenance repair xerox technicians tech communities-of-practice maintainers ops)
Listen to the whispers: web timing attacks that actually work
Impressively fiendish. Figuring out attacks using 5ms differences in response times
(tags: timing-attacks attacks exploits web http security infosec)
-
“How chat-based Large Language Models replicate the mechanisms of a psychic’s con”:
RLHF models in general are likely to reward responses that sound accurate. As the reward model is likely just another language model, it can’t reward based on facts or anything specific, so it can only reward output that has a tone, style, and structure that’s commonly associated with statements that have been rated as accurate. [….] This is why I think that RLHF has effectively become a reward system that specifically optimises language models for generating validation statements: Forer statements, shotgunning, vanishing negatives, and statistical guesses. In trying to make the LLM sound more human, more confident, and more engaging, but without being able to edit specific details in its output, AI researchers seem to have created a mechanical mentalist. Instead of pretending to read minds through statistically plausible validation statements, it pretends to read and understand your text through statistically plausible validation statements.
(tags: ai chatgpt llms ml psychology cons mind-reading psychics)
-
This is a very tempting mod to add to my Gaggia Classic espresso machine. Although I’d probably need to buy a backup first — my wife might kill me if I managed to break the most important device in the house… “With Gaggiuino, you can exactly control the pressure, temperature, and flow of the shot over the exact duration of the shot, and build that behavior out as a custom profile. One pre-programmed profile attempts to mimic the style of a Londinium R Lever machine. Another creates filter coffee. Yet another preinfuses the basket, allowing the coffee to bloom and maximizing the potential extraction. While other machines do do this (I would be remiss to not mention the Decent Espresso machine, itself an important milestone), they often cost many thousands of dollars and use proprietary technology. Gaggiuino on the other hand, is user installed and much more open.”
(tags: gaggia gaggia-classic espresso coffee hacks gaggiuino mods)