Dear Sentinels,
This week, we have a full story of spyware to consider, and a look at humans staying in the loop even as AI outperforms the standard cohort. It reminds me of the new robot released for homes, but only if you agree to the fine print: you pay upfront, you receive the robot only when it's ready, and even then not all the features will be available. In fact, they will run the whole thing with a human in the loop! Here is the video on that if you're interested.
First, we turn to the story of spyware, which is not an intrusive foreign agent coming to collect our data, but Microsoft itself and its AI-first push. It concerns Copilot+ PCs and the Microsoft Recall feature, as explained below.
The Simplest Way to Create and Launch AI Agents and Apps
You know that AI can help you automate your work, but you just don't know how to get started.
With Lindy, you can build AI agents and apps in minutes simply by describing what you want in plain English.
→ "Create a booking platform for my business."
→ "Automate my sales outreach."
→ "Create a weekly summary about each employee's performance and send it as an email."
From inbound lead qualification to AI-powered customer support and full-blown apps, Lindy has hundreds of agents that are ready to work for you 24/7/365.
Stop doing repetitive tasks manually. Let Lindy automate workflows, save time, and grow your business.
News from around the web
Comprehensive Risk Assessment: Microsoft Recall Feature
Introduction to Microsoft Recall and Copilot+ PCs
Microsoft Recall is an artificial intelligence application designed to create a comprehensive, searchable history of a user's activity by systematically taking screenshots of their screen. Its stated purpose is to allow users to find anything they have previously seen or done on their computer. The feature's rollout has been tumultuous. Introduced in 2024, it was met with immediate and severe criticism after researchers discovered that its database was stored in plain text. In response, Microsoft withdrew the feature before re-introducing a "more secure" version. Despite these changes and its "preview" status, the user onboarding process on new Copilot+ PCs actively encourages users to enable Recall during the initial Windows setup experience.
Analysis of Security Vulnerabilities and Exploitation Vectors
For a feature that systematically captures and stores screenshots of user activity, the implementation of robust data protection and stringent access controls is paramount. Any failure in these areas transforms a tool of convenience into a significant security liability. This section deconstructs the specific technical vulnerabilities and exploitation vectors identified through independent security testing.
Failures of Sensitive Information Filtering
Recall includes a "Filter sensitive information" setting that is enabled by default, with the goal of preventing the capture of personal and financial data. However, testing reveals this filter is unreliable and fails in numerous common scenarios, creating a centralized repository of sensitive information.
Financial Data: The filter successfully excluded account and routing numbers on a banking website but captured the bank's homepage, account balances, and lists of deposits. In another test, while a standard credit card form was correctly filtered, a custom form was fully captured, including card number, expiration date, and CVC.
Passwords and Credentials: The filter demonstrated mixed results. It successfully ignored Google Chrome's password manager but captured a full list of usernames and passwords stored in a simple Notepad file that lacked identifying labels like "username." Furthermore, during a login attempt on PayPal, the user's username was captured on the login screen.
These widespread failures indicate that the filter's reliance on contextual text cues is architecturally flawed and cannot be trusted to reliably redact sensitive data from unstructured screen captures.
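To see why a cue-based filter fails this way, consider a minimal sketch of a keyword-driven redaction check. This is my own illustrative code under stated assumptions, not Microsoft's actual implementation: it suppresses a capture only when a recognisable label appears near the sensitive value, so unlabelled secrets slip through exactly as observed in testing.

```python
import re

# Hypothetical cue list; the real filter's heuristics are not public.
SENSITIVE_CUES = re.compile(
    r"(password|username|card number|cvc|routing number)", re.IGNORECASE
)

def should_skip_capture(screen_text: str) -> bool:
    """Suppress the screenshot only if an explicit sensitive-data label is visible."""
    return bool(SENSITIVE_CUES.search(screen_text))

# A labelled login form is correctly skipped...
labelled = "Username: alice\nPassword: hunter2"
# ...but the same secrets in an unlabelled Notepad file carry no cue,
# so the capture proceeds and the credentials are stored.
unlabelled = "alice\nhunter2\nbank.example.com"

print(should_skip_capture(labelled))    # True  (capture suppressed)
print(should_skip_capture(unlabelled))  # False (secrets captured anyway)
```

The sketch makes the architectural point concrete: any filter keyed on contextual text cues is defeated by sensitive data that simply lacks a label.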
Weaknesses in User Authentication and Access Control
Microsoft states that viewing or searching Recall snapshots requires authentication via Windows Hello. However, this control is fatally undermined by its support for PIN codes as a fallback method, which reduces Recall's entire security model to its weakest link: a low-entropy PIN. Independent tests demonstrated that this vulnerability is easily exploitable: common remote access software such as TeamViewer and VNC seamlessly presents the local PIN prompt to a remote attacker.
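"Low-entropy" is easy to quantify. The back-of-the-envelope comparison below uses illustrative assumptions (uniformly random secrets, typical PIN lengths), not measured data:

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a uniformly random secret: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)

pin4 = entropy_bits(10, 4)        # ~13.3 bits -> 10,000 guesses exhaust the space
pin6 = entropy_bits(10, 6)        # ~19.9 bits -> 1,000,000 guesses
password = entropy_bits(62, 10)   # ~59.5 bits for 10 mixed-case alphanumerics

print(f"4-digit PIN:  {pin4:.1f} bits")
print(f"6-digit PIN:  {pin6:.1f} bits")
print(f"10-char pass: {password:.1f} bits")
```

A 4-digit PIN offers roughly 13 bits of entropy, so once an attacker reaches the PIN prompt remotely, the entire Recall history sits behind a search space of only 10,000 possibilities.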
Undermining Existing User Privacy Tools
Recall's system-level implementation actively undermines privacy-centric features that users rely on within other applications. This creates a two-pronged failure of user agency and data security. First, proactive privacy measures are rendered ineffective; for example, Brave browser's "Off-the-Record" and "Forgetful Browsing" features are nullified because Recall captures screenshots of user activity before the browser's controls can take effect. Second, the entire history captured by this process is then exposed to remote compromise via a simple PIN. This means a user's explicit intent to keep their activity private is not only ignored by the operating system, but the resulting data is then left vulnerable to remote access.
Mitigations and Controls
In response to initial security criticisms, Microsoft announced a series of enhancements to protect the data captured by Recall. These measures were intended to address the most severe vulnerabilities and assure users of the feature's safety. This section critically evaluates the effectiveness and practical limitations of these official mitigations against the findings of independent analysis.
The following table contrasts Microsoft's stated security measures with the limitations and counterarguments identified during testing.
Stated Mitigation | Identified Limitations & Counterarguments
--- | ---
Windows Hello Authentication | Access is not limited to biometrics; PIN codes are a supported fallback, which was successfully used to gain full remote access via TeamViewer and VNC, reducing the security model to its weakest link.
Filter sensitive information | The filter is enabled by default but has been proven to fail in numerous test cases, capturing credit cards, passwords, and PII. Microsoft acknowledges it is not perfect.
Preview Status | Despite being labeled a "preview," the feature is actively pushed to users during the standard Windows Out-of-Box Experience (OOBE) on new PCs.
Application/Website Blacklisting | This control is reactive, placing an impossible burden of constant vigilance on the user. It is fundamentally inadequate for corporate environments, where sensitive data can appear in unexpected contexts, such as a CRM screenshot in a chat client.
This analysis reveals a significant gap between the theoretical security promised by Microsoft's mitigations and the practical realities observed during independent testing.
Primary Security Risks
Inconsistent Sensitive Data Filtering: The feature's automated filter is architecturally flawed and unreliably captures highly sensitive credentials, financial data, and Personally Identifiable Information (PII), creating a centralised "treasure trove for crooks."
Weak Access Control: Remote and local access to the entire Recall history is possible using only a low-entropy PIN, a fallback mechanism that completely bypasses biometric security measures and exposes the database to common remote access attacks.
Potential Platform Vulnerabilities: The underlying VBS encryption is not guaranteed to be immune to sophisticated side-channel attacks, making its security dependent on perfect and persistent system patching.
Primary Privacy Risks
Threat to Vulnerable Users: The feature creates a tool that can be easily misused for interpersonal surveillance, posing a significant danger to individuals like domestic violence victims by exposing their private activities and attempts to seek help.
Erosion of User Control: Recall systematically circumvents established privacy tools within applications, capturing data that users intend to keep private and rendering their choices and control meaningless.
Ultimately, while the feature may have been developed with good intentions, its implementation carries fundamental flaws that present an unacceptable level of risk. If you can no longer trust Microsoft with your data, your only real options are to stick with an older system, move to a Mac, or, as I have done, switch to Linux.
Summary
The paper introduces InstructGPT, a model fine-tuned using reinforcement learning from human feedback (RLHF) to align large language models with diverse user intent. Outputs from the smaller 1.3B-parameter InstructGPT model were significantly preferred over those of the 175B-parameter GPT-3, demonstrating greater helpfulness and truthfulness and reduced toxicity.

Background
While large language models (LLMs) can perform many NLP tasks, they frequently generate unintended outputs, such as making up facts or producing toxic content, indicating a lack of alignment with user intent. The core issue is that the standard objective of predicting the next token is misaligned with the goal of following user instructions helpfully and safely. Averting these negative behaviours is critical for LLMs deployed across numerous applications. The research aims to align LMs so they are helpful, honest, and harmless, encompassing explicit and implicit user intentions.
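The RLHF recipe addresses this misalignment by training a reward model on human preference comparisons before optimising the policy against it. A minimal NumPy sketch of the pairwise preference loss, -E[log sigmoid(r(x, y_w) - r(x, y_l))], where y_w is the human-preferred completion and y_l the rejected one (my own illustrative code, not the paper's implementation):

```python
import numpy as np

def reward_model_loss(r_chosen: np.ndarray, r_rejected: np.ndarray) -> float:
    """Pairwise preference loss: -mean(log sigmoid(r_chosen - r_rejected)).
    Minimising it pushes the reward model to score preferred completions higher."""
    diff = r_chosen - r_rejected
    return float(np.mean(-np.log(1.0 / (1.0 + np.exp(-diff)))))

# Toy reward scores for three comparison pairs (made-up numbers):
chosen = np.array([2.0, 1.5, 0.3])
rejected = np.array([0.5, 1.0, 0.8])

# The loss shrinks as preferred completions are scored higher than rejected ones.
print(reward_model_loss(chosen, rejected))
```

The policy is then fine-tuned with reinforcement learning to maximise this learned reward, which is how human judgements of "helpful, honest, and harmless" replace raw next-token prediction as the training signal.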
Use-case
The InstructGPT models were developed by fine-tuning on a broad distribution of tasks derived primarily from prompts submitted to the OpenAI API Playground. These real-world use cases encompass diverse natural language tasks, including generation, question answering, summarisation, extraction, and dialogue. Analysis of the API prompt dataset reveals that the most common categories are open-ended generation, open question answering, and brainstorming. Therefore, the primary application is improving performance and reliability for a wide range of user-driven generative and instructional tasks in a deployed assistant setting.

Future Work
Despite improvements, InstructGPT models still exhibit limitations, such as failing to follow instructions, making up facts, and generating potentially harmful content when explicitly instructed to do so. Future work should focus on further reducing toxic and biased outputs, potentially by implementing adversarial data collection or combining RLHF with pretraining data filtering methods. Additionally, research is needed to train models to refuse potentially harmful or dishonest user instructions, which requires determining context-dependent refusal mechanisms. Researchers must also continue to study how to eliminate remaining performance regressions associated with the alignment tax.
You can download the article here.


