Solving the blame game

Context: you are stuck on a problem. You cannot find the root cause, and you can neither understand nor reproduce the issue. Maybe you are tired after a long day and have already spent 5 hours without a single good run. Maybe you don’t want to take responsibility for it.

Introducing blaming: the fine art of making others responsible for all the difficult things that happen to you.

There are plenty of contexts and examples of blaming these days. Here I only want to look at this mindset within the SRE/DevOps/sysadmin fields. Why do we blame each other, and is it right or wrong? How can we fix or improve our mindset? How can we do better?


Signals in Linux

Definition:

  • System calls: the communication channel between a user-space program and the kernel
  • Signals: a different channel, used for inter-process communication
  • Signals don’t carry arguments; their names are self-explanatory
  • Each signal is identified by a number, e.g. SIGKILL (9)
  • That’s why we use kill -9 <PID> to kill a process: the kill command sends the specified signal to the process with the given ID <PID>
  • When we run kill -9 <PID>, the process does not terminate itself; instead, we are telling the OS to stop running the program, no matter what the program is doing
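The points above can be tried out with a small Python sketch. It assumes a Linux (POSIX) host; the helper names kill_child_with_sigkill and can_install_handler are made up for this illustration. The first helper does what kill -9 <PID> does to a sleeping child process, and the second checks whether a process is even allowed to install its own handler for a given signal.

```python
import os
import signal
import subprocess
import sys


def kill_child_with_sigkill():
    """Start a sleeping child process, send it SIGKILL, return its exit status."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    # Same effect as running `kill -9 <PID>` from a shell.
    os.kill(child.pid, signal.SIGKILL)
    # On POSIX, a negative return code -N means "terminated by signal N".
    return child.wait()


def can_install_handler(signum):
    """Return True if this process may install its own handler for signum."""
    try:
        previous = signal.signal(signum, lambda received, frame: None)
    except (OSError, ValueError):
        # The kernel refuses handlers for SIGKILL, so a process cannot catch it.
        return False
    if previous is not None:
        signal.signal(signum, previous)  # put the original handler back
    return True


if __name__ == "__main__":
    print("child exit status:", kill_child_with_sigkill())
    print("can handle SIGKILL:", can_install_handler(signal.SIGKILL))
```

The child never gets a chance to react: the kernel removes it, which is why the exit status reports the signal number instead of a normal exit code.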


Learning log for August

  1. Export large tables in MySQL
  2. Bash completion on Debian 9


RDS as-a-fckin-service

AWS RDS documentation: Depending on the DB instance class and the amount of storage, it can take up to 20 minutes before the new instance is available.

Reality:

  • RDS instance status: creating
  • RDS has been stuck in this fucking step for nearly 2 hours and counting
  • db-fuck-as-a-service


Mindset for building HA and scalable systems

A system or its infrastructure must have:

  • Fault tolerance
  • No single point of failure
  • Multiple security layers, more than just one or two
  • Auto-failover without requiring human intervention
  • Heartbeat monitoring on all running components
  • Infrastructure as code
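
As a rough illustration of the heartbeat and auto-failover items above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the HeartbeatMonitor class, the 5-second timeout, and the component names "primary" and "standby". A real setup would use a proper monitoring and orchestration stack; this only shows the idea of picking a healthy component without a human in the loop.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat of each component (illustrative only)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock      # injectable so the logic can be tested
        self.last_seen = {}     # component name -> time of last heartbeat

    def beat(self, component):
        """Record a heartbeat from a component."""
        self.last_seen[component] = self.clock()

    def healthy(self, component):
        """A component is healthy if it reported within the timeout."""
        seen = self.last_seen.get(component)
        return seen is not None and self.clock() - seen <= self.timeout_seconds

    def pick_active(self, candidates):
        """Return the first healthy candidate in priority order, or None."""
        for component in candidates:
            if self.healthy(component):
                return component
        return None


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=5)
    monitor.beat("primary")
    monitor.beat("standby")
    print(monitor.pick_active(["primary", "standby"]))
```

If the primary stops sending heartbeats, pick_active falls through to the standby, and it returns None once nothing is healthy, which is the case that should page someone.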