AI: Bias in Models



Practical advice

It is important for professionals who use these systems to understand that there are ways to prepare for results that look highly accurate but are not. This is often because AI systems follow the chain: data, processing, algorithm, product.

For integration into work processes:

  • Demand explainability: There are methods that analyze AI systems in depth. This way you can learn about the system's limitations and the extent to which the people who developed it understand its impact.

Specifically for LLMs:

  • Verify the information you receive: …
  • Use …

Specifically for companies:




Introduction to AI

Artificial intelligence is reshaping our world, from the recommendations you see on social media to the hiring decisions made by companies. But these systems aren’t neutral—they carry biases that can perpetuate or even amplify societal inequalities. Understanding how bias emerges in AI models, particularly in modern architectures like LSTMs and Transformers, is crucial for anyone who interacts with or builds these systems.

Artificial Intelligence: a tool that can learn from the information it has, enough to suggest decisions. Often AI is the product, and the technical system inside it is called Machine Learning. These systems date back to the early history of computing, but they have recently become inseparable tools of everyday life.

A brief history
Simple models
Advanced models
The need for AI




Models vs. Products

The products we use every day often play on these two concepts: one moment they are the most advanced intelligence humanity has ever seen, and the next they cannot confirm that 1+1 is not equal to “cat”.

This is done so that the entities developing them can ride the boost from social media without taking responsibility for the problems these products carry when they have not been adapted with the necessary care. The responsibility is passed on to the users.

Model determinism
Legal liability
Sensitive domains
LLMs: Agreeableness

Sycophancy: the tendency of LLMs to agree with and flatter the user rather than stay accurate.




Technical bias

AI bias occurs when machine learning models produce systematically prejudiced results due to erroneous assumptions in the learning process. These biases can stem from multiple sources: the training data itself, the way features are selected and weighted, the optimization objectives, and even the architectural choices of the model.

The model architecture and training process can introduce their own biases, independent of the data. This is where understanding specific architectures becomes crucial.

From the data

The most fundamental source of bias is the training data. If your dataset overrepresents certain groups or perspectives, the model will learn those imbalances as patterns. This includes:

  • Sampling bias: When training data doesn’t represent the full population.
  • Historical bias: When data reflects past discriminatory practices.
  • Label bias: When human-annotated labels contain subjective prejudices.

A hiring algorithm trained on historical company data might learn to favor male candidates for technical positions simply because most previous hires were men—not because men are more qualified, but because the training data reflected past discrimination.

Amazon scrapped an AI recruiting tool in 2018 after discovering it penalized résumés containing the word “women’s” (as in “women’s chess club”) and downgraded graduates from all-women’s colleges—patterns it learned from reviewing 10 years of male-dominated hiring.

Healthcare AI trained primarily on data from lighter-skinned patients performs worse at diagnosing skin conditions in people with darker skin. A 2019 study found that dermatology AI systems were significantly less accurate for Black patients because darker skin tones were underrepresented in training datasets.

From the algorithms: LSTMs

How LSTMs Work

Long Short-Term Memory networks (LSTMs) are a type of recurrent neural network designed to process sequential data. They maintain a "memory" of previous inputs through a system of gates:

  • Forget Gate: Decides what information to discard from memory
  • Input Gate: Determines what new information to store
  • Output Gate: Controls what information to output based on memory

The memory cell state equation looks like this:
C_t = f_t * C_(t-1) + i_t * ~C_t

Where f_t is the forget gate, C_(t-1) is previous memory, i_t is the input gate, and ~C_t is candidate memory.
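
To make the gate interplay concrete, here is a minimal NumPy sketch of a single LSTM cell step; the weight names and shapes (W_f, W_i, W_c, W_o) are illustrative conventions, not taken from any particular library:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_f, W_i, W_c, W_o):
    """One LSTM time step. Each W_* maps the concatenated
    [h_prev, x_t] vector to a hidden-sized vector."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W_f @ z)               # forget gate: what to discard from memory
    i_t = sigmoid(W_i @ z)               # input gate: what new information to store
    c_tilde = np.tanh(W_c @ z)           # candidate memory ~C_t
    c_t = f_t * c_prev + i_t * c_tilde   # C_t = f_t * C_(t-1) + i_t * ~C_t
    o_t = sigmoid(W_o @ z)               # output gate: what to expose from memory
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Toy usage with random weights (hidden size 4, input size 3).
rng = np.random.default_rng(0)
H, D = 4, 3
W = [rng.normal(size=(H, H + D)) for _ in range(4)]
h, c = lstm_step(rng.normal(size=D), np.zeros(H), np.zeros(H), *W)
```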

Where Bias Creeps Into LSTMs

Sequential Bias Amplification: LSTMs process data sequentially, building context from earlier inputs. If early sequences contain biased patterns, these get encoded into the hidden state and influence all subsequent predictions. This creates a "bias memory" effect.

Consider a language model trained on text data. If the training corpus contains sentences like "The doctor said... he" and "The nurse said... she" more frequently, the LSTM's memory cells learn these gender associations. When generating text, the forget and input gates will preferentially maintain these stereotypical patterns because they appeared consistently during training.

Technical Issue: The hidden state vector in LSTMs has limited capacity (typically 256-1024 dimensions). The model must compress information, and it prioritizes the most frequent patterns—which often includes societal biases present in training data.

From the algorithms: Transformers

How Transformers Work

Transformers revolutionized AI by replacing sequential processing with parallel attention mechanisms. The key innovation is self-attention, which allows the model to weigh the importance of different input tokens simultaneously.

The attention mechanism computes:

Attention(Q, K, V) = softmax(QK^T / √d_k)V

Where Q (queries), K (keys), and V (values) are learned transformations of the input. This allows each token to "attend to" every other token in the sequence.
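
As a point of reference, here is a minimal NumPy sketch of that computation for a single attention head, without batching or the learned projections:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V.
    Q, K, V have shape [seq_len, d_k]."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # similarity of every token to every token
    weights = softmax(scores, axis=-1)   # each row sums to 1: how much a token attends
    return weights @ V, weights
```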

Multi-Head Attention

Transformers use multiple attention heads in parallel, each learning different types of relationships. A typical model might have 12-16 heads per layer and 12-96 layers total.
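
Multi-head attention runs the same computation in parallel on slices of the model dimension. A rough sketch, reusing the attention function from the previous snippet and assuming d_model divides evenly by the number of heads (real models also apply learned per-head projections, omitted here):

```python
def multi_head(Q, K, V, n_heads):
    """Split [seq, d_model] into n_heads slices of size d_model // n_heads,
    attend within each slice, then concatenate the head outputs."""
    seq_len, d_model = Q.shape
    d_head = d_model // n_heads
    outs = []
    for h in range(n_heads):
        sl = slice(h * d_head, (h + 1) * d_head)
        out, _ = attention(Q[:, sl], K[:, sl], V[:, sl])
        outs.append(out)
    return np.concatenate(outs, axis=-1)
```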

Where Bias Creeps Into Transformers

1. Attention Pattern Bias: The attention mechanism learns which tokens to focus on based on training data patterns. If certain demographic terms (like gendered pronouns) consistently appear in specific contexts, the attention heads will encode these associations.

Example: In sentence completion tasks, when the model sees "The CEO announced...", attention heads might disproportionately activate connections to male pronouns because of statistical patterns in training data where most CEOs mentioned were men.

Example: When Google's language model was asked to translate gender-neutral Turkish sentences into English, it consistently produced "He is a doctor" and "She is a nurse"—revealing learned gender stereotypes. Turkish doesn't have gendered pronouns, so the model had to "decide" based on profession, exposing its bias.

2. Embedding Space Geometry: Transformers begin by converting words into high-dimensional vectors (embeddings). Words that appear in similar contexts get placed near each other in this space. Biased training data creates problematic geometric relationships.

Research Finding: Word embeddings in language models often encode analogies like "man is to programmer as woman is to homemaker" based purely on statistical co-occurrence in training text, reflecting societal biases.
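
This geometry can be checked directly by projecting profession vectors onto a "gender direction". A hedged sketch with made-up toy vectors; a real test would load pre-trained embeddings such as GloVe or word2vec:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Toy 4-d embeddings for illustration only.
emb = {
    "man":        np.array([ 1.0, 0.2, 0.0, 0.1]),
    "woman":      np.array([-1.0, 0.2, 0.0, 0.1]),
    "programmer": np.array([ 0.6, 0.9, 0.3, 0.0]),
    "homemaker":  np.array([-0.7, 0.8, 0.3, 0.0]),
}

gender_direction = emb["man"] - emb["woman"]
for word in ("programmer", "homemaker"):
    print(word, round(cosine(emb[word], gender_direction), 3))
# Positive scores lean "male", negative lean "female": biased embeddings
# place professions off-center along this axis.
```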

3. Layer-by-Layer Bias Amplification: Transformers stack many layers (GPT-3 has 96 layers), and each layer can amplify biases from previous layers. Early layers learn basic patterns, while later layers learn more abstract concepts—but biased patterns from early layers get baked into these abstractions.

4. Pre-training Bias Lock-in: Modern transformers use transfer learning: they're pre-trained on massive datasets then fine-tuned for specific tasks. Biases learned during pre-training are extremely difficult to remove during fine-tuning because they're encoded deeply across billions of parameters.

Scale and Bias

Larger transformer models (with more parameters) don't automatically reduce bias—they can actually memorize more subtle biases from training data. GPT-3 has 175 billion parameters, giving it enormous capacity to encode both useful patterns and problematic biases.



Aspect | LSTMs | Transformers
------ | ----- | ------------
Processing | Sequential, left-to-right | Parallel, all-to-all attention
Bias propagation | Through hidden state memory | Through attention patterns and embeddings
Scale | Typically millions of parameters | Billions to trillions of parameters
Interpretability | Somewhat interpretable through gate activations | Difficult; attention patterns provide some insight
Bias mitigation | Can target specific gate behaviors | Requires intervening across many layers and heads




Technical interventions

Detecting Bias

For Language Models:

  • Embedding Association Tests: Measure geometric relationships between concept embeddings (e.g., career words vs. family words, and their association with gender)
  • Template-Based Probes: Test model completions for templates like "The [PROFESSION] said [PRONOUN]" across different professions
  • Counterfactual Evaluation: Compare model outputs when only demographic attributes are changed
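
As an illustration of the template-based probe above, a minimal sketch using the Hugging Face fill-mask pipeline (assumes the transformers package is installed; the model choice and template are illustrative):

```python
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

TEMPLATE = "The {profession} said that [MASK] was busy."
for profession in ("doctor", "nurse", "engineer", "teacher"):
    results = fill(TEMPLATE.format(profession=profession), targets=["he", "she"])
    scores = {r["token_str"]: round(r["score"], 4) for r in results}
    print(profession, scores)
# A systematic gap between P("he") and P("she") across professions
# signals learned gender associations.
```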

For Computer Vision:

  • Subgroup Performance Analysis: Measure accuracy across different demographic groups
  • Confusion Matrix Analysis: Identify systematic misclassification patterns
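
Subgroup performance analysis reduces to computing the metric per group and comparing. A minimal sketch with hypothetical labels and group tags:

```python
import numpy as np

def subgroup_accuracy(y_true, y_pred, groups):
    """Accuracy broken down by demographic group; large gaps between
    groups are a red flag even when overall accuracy looks fine."""
    y_true, y_pred, groups = map(np.asarray, (y_true, y_pred, groups))
    return {g: float((y_pred[groups == g] == y_true[groups == g]).mean())
            for g in np.unique(groups)}

print(subgroup_accuracy(
    y_true=[1, 0, 1, 1, 0, 1],
    y_pred=[1, 0, 0, 1, 0, 0],
    groups=["a", "a", "b", "a", "b", "b"],
))  # e.g. {'a': 1.0, 'b': 0.333...}: group "b" is systematically failed
```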

Mitigation Strategies

Important: No single technique eliminates bias completely. Effective bias mitigation requires a multi-layered approach addressing data, algorithms, and deployment.

1. Data-Level Interventions

  • Balanced Sampling: Ensure training data represents all relevant groups proportionally
  • Data Augmentation: Generate synthetic examples to balance underrepresented groups
  • Careful Annotation: Use diverse annotators and clear guidelines to reduce label bias
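
A common concrete form of data augmentation here is counterfactual swapping: duplicate each training sentence with demographic terms exchanged so the label cannot attach to them. A rough sketch with a hand-picked swap list (real lists are much larger and need care around names, grammar, and context; "her" is genuinely ambiguous between "his" and "him"):

```python
import re

# Illustrative swap pairs only.
SWAPS = {"he": "she", "she": "he", "his": "her", "her": "his",
         "man": "woman", "woman": "man"}

def counterfactual(sentence):
    """Return the sentence with gendered terms swapped, keeping capitalization."""
    def repl(m):
        w = m.group(0)
        s = SWAPS[w.lower()]
        return s.capitalize() if w[0].isupper() else s
    pattern = r"\b(" + "|".join(SWAPS) + r")\b"
    return re.sub(pattern, repl, sentence, flags=re.IGNORECASE)

# Train on both the original and the swapped copy.
augmented = [s for orig in ["He said the man was busy."]
             for s in (orig, counterfactual(orig))]
print(augmented)
```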

2. Algorithm-Level Interventions

  • Adversarial Debiasing: Train a secondary network to detect bias signals, then optimize the main model to fool this detector
  • Constrained Optimization: Add fairness constraints to the loss function during training
  • Embedding Debiasing: Post-process word embeddings to remove bias directions while preserving useful information
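
The simplest form of embedding debiasing ("hard debiasing", after Bolukbasi et al.) removes the component of each vector along an identified bias direction. A minimal sketch:

```python
import numpy as np

def debias(vec, bias_direction):
    """Remove the component of vec along the (normalized) bias direction."""
    b = bias_direction / np.linalg.norm(bias_direction)
    return vec - (vec @ b) * b

# Toy example: after projection, "programmer" no longer leans
# toward either pole of the man-woman axis.
man = np.array([1.0, 0.2, 0.0])
woman = np.array([-1.0, 0.2, 0.0])
programmer = np.array([0.6, 0.9, 0.3])
print(debias(programmer, man - woman))
```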

3. Architecture-Specific Techniques

For LSTMs: Monitor and regularize gate activations to prevent bias amplification through memory. Some researchers add "fairness gates" that explicitly control demographic information flow.
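
One way to implement that monitoring idea is an auxiliary penalty on the correlation between gate activations and a protected attribute, added to the training loss. This is our illustration of the concept, not a published recipe:

```python
import numpy as np

def gate_bias_penalty(gate_acts, protected):
    """gate_acts: [batch, hidden] activations of one gate;
    protected: [batch] 0/1 group labels. Returns the mean absolute
    correlation between each gate unit and the attribute; adding this
    to the loss discourages gates from encoding the attribute."""
    g = gate_acts - gate_acts.mean(axis=0)
    p = protected - protected.mean()
    corr = (g * p[:, None]).mean(axis=0) / (g.std(axis=0) * p.std() + 1e-8)
    return float(np.abs(corr).mean())
```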

For Transformers: Techniques include attention head pruning (removing heads that encode bias), layer-wise bias mitigation, and controlled generation using steering vectors to guide outputs away from biased patterns.
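
Head pruning can be prototyped by ablating one head at a time and re-running a bias probe: heads whose removal lowers the bias metric at little cost in accuracy are pruning candidates. A schematic sketch; model.mask_head/unmask_head, probe_bias, and probe_accuracy are placeholder hooks we assume, not a real API:

```python
def score_heads(model, heads, probe_bias, probe_accuracy):
    """Ablate each (layer, head) pair in turn and record how much the
    bias metric and task accuracy change relative to the full model."""
    base_bias, base_acc = probe_bias(model), probe_accuracy(model)
    report = {}
    for layer, head in heads:
        model.mask_head(layer, head)          # zero out this head's output
        report[(layer, head)] = {
            "bias_drop": base_bias - probe_bias(model),
            "acc_drop": base_acc - probe_accuracy(model),
        }
        model.unmask_head(layer, head)        # restore before the next test
    return report
```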

Emerging Challenges

As AI models grow larger and more capable, bias concerns evolve:

  • Emergent Biases: Large models develop unexpected biases not present in training data through complex pattern interactions
  • Multimodal Bias: Models processing text, images, and audio simultaneously can develop cross-modal biases
  • Deployment Bias: Model behavior shifts in real-world use due to distribution shifts and feedback loops

Real-World Example: In 2015, Google Photos labeled Black people as "gorillas" because the image recognition system had insufficient training data for diverse skin tones. Rather than truly fixing the underlying bias, Google initially just blocked the system from labeling anything as "gorilla"—a band-aid solution that highlighted how difficult bias mitigation really is.




Expert review

  • Knowledge at your fingertips, with hidden costs.

It is necessary for a good portion of this information to be checked regularly by experts in the field. This way you can get ahead of problems that were overlooked or never surfaced. Also, develop the information using methods that keep it somewhat testable; otherwise, something that looks very accurate can be much harder to probe for the problems it carries.




Conclusion

Understanding bias in AI models requires both technical knowledge and critical thinking about societal patterns. LSTMs and Transformers—the workhorses of modern NLP—encode biases through their fundamental mechanisms: sequential memory propagation in LSTMs, and attention patterns plus embedding geometry in Transformers.

The good news is that awareness is growing, and researchers are developing increasingly sophisticated mitigation techniques. The challenge is that bias is not a bug to be fixed but an inherent challenge in learning from human-generated data that reflects human prejudices.


