Daniel Kosbab · 3 min read

Why AI ethics and AI safety are different fields

AI ethics and AI safety are treated as one field. They are not.

They share some practitioners. Some papers sit in both. The words appear next to each other in press releases. And when someone argues that an AI did something wrong, they often cannot tell you which of the two fields they are upset about.

The distinction is simple and load-bearing. Getting it wrong leads to arguments where the two sides are answering different questions.

AI ethics

AI ethics asks: what should an AI do.

The questions are about values, fairness, consent, power. Examples of real AI ethics work:

  • Is it acceptable to train a model on writing that was scraped without the authors' consent.
  • When a hiring model is more accurate for one demographic than another, is that acceptable.
  • Should AI systems be allowed to impersonate humans in customer service.
  • What rights do users have to explanations of decisions that affected them.
  • Who decides what values an AI should be aligned with.

These questions have answers that depend on values. Different people with different frameworks will answer differently. Resolving them is a normative task, not an engineering task.

AI safety

AI safety asks: how do we ensure an AI reliably does what we decided.

The questions are about capability, alignment, and control. Examples of real AI safety work:

  • Given a specification, how do we train a model that follows it.
  • Can we detect whether a model is being deceptive.
  • How do we maintain control of a system more capable than its operators.
  • How do we prevent a model from finding loopholes in its objective function.
  • What evaluation methods survive adversarial input.

These questions have technical answers, or technical approximations to answers. Different people with the same framework should converge on similar approaches given the same evidence. Resolving them is an engineering task, though a hard one.

Where the conflation causes problems

Two specific bad arguments come out of treating these as one field.

The first: "safety researchers should address ethics concerns like bias." Mostly, they should not. Bias in output is an ethics question (should the model give this answer) that sits upstream of the safety question (does the model reliably do what we specified). If the specification is biased, fixing the safety pipeline will faithfully reproduce the bias. The specification is the ethics question. You need ethicists for that.

The second: "ethicists should address safety concerns like deceptive alignment." They should not. What principles an AI should have is a normative question. Whether the AI actually has those principles or merely appears to is an engineering question. Philosophers can tell you what a good model looks like. They cannot tell you how to verify that the model you deployed matches the description.

When the two fields try to do each other's work, the results are weaker than either could produce alone.

The relationship between them

Ethics specifies. Safety executes.

Decide first what the model should do. That is ethics. Then figure out how to make a model that does that reliably. That is safety.

Neither is optional. An unethical specification faithfully executed by a safe system is bad. A safe-to-deploy system following a half-baked specification is also bad. Both problems have to be solved. They are solved by different kinds of work.

The simplification that matters: if the question is "what should AI do," call it ethics. If the question is "can we make AI actually do the thing we decided on," call it safety. Arguments that mix the two without flagging the mix are usually confused.

The position

I take both fields seriously. I am glad there are people working on each.

I am less patient with discourse that treats them as interchangeable, because that discourse forecloses useful work. You cannot answer an ethics question with a technical intervention. You cannot answer a safety question by reasoning about values. Trying to do either produces noise.

Keep the labels separate. Let each field do the thing it is good at. The output of both together is what we need.

© 2026 Daniel Kosbab
