Solving the blaming game
Context: you are stuck on a problem, cannot find the root cause, and can neither understand nor reproduce it. Maybe you are tired after a long day and have already spent 5 hours without any progress. Maybe you don’t want to take responsibility for it.
Introducing blaming: the fine art of making others responsible for all the difficult things that happen to you.
Nowadays there are many contexts and examples of blaming. I only want to widen the scope of this mindset to the SRE/DevOps/Sysadmin fields. Why do we blame each other, rightly or wrongly? How can we fix or improve our mindset? How can we do better?
Signals in Linux
Definition:
- System calls: the communication channel between user-space programs and the kernel
- Signals: a different channel, used for inter-process communication
- Signals don’t carry any arguments; their names are self-explanatory
- Each signal is identified by a number, e.g. SIGKILL (9). That’s why we use kill -9 <PID> to kill a process: the kill command sends the given signal to the process with the given <PID>
- When we run kill -9 <PID>, the process does not terminate itself; instead we’re telling the OS to stop running the program, no matter what the program is doing
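The points above can be tried out with a small shell sketch. It is only an illustration under assumptions: the background loop is a throwaway stand-in for a real program, and the variable names (stubborn_pid, still_alive, kill_status) are made up for this example. It shows that the number 9 maps to the name KILL, that a process can choose to ignore SIGTERM, and that SIGKILL stops it anyway because the kernel acts on it, not the program.

```shell
# Signal numbers map to names; kill -l translates a number into its name.
sig_name=$(kill -l 9)

# Start a throwaway background process that ignores SIGTERM.
sh -c 'trap "" TERM; while :; do sleep 1; done' &
stubborn_pid=$!
sleep 1                      # give the child time to install its trap

# SIGTERM is delivered to the process, which is free to ignore it.
kill -TERM "$stubborn_pid"
sleep 1
if kill -0 "$stubborn_pid" 2>/dev/null; then
    still_alive=yes          # kill -0 only checks that the process exists
else
    still_alive=no
fi

# SIGKILL cannot be caught or ignored: the kernel stops the process.
kill -9 "$stubborn_pid"
wait "$stubborn_pid"
kill_status=$?               # the shell reports 128 + signal number

echo "signal 9 is SIG$sig_name"
echo "alive after SIGTERM: $still_alive"
echo "exit status after SIGKILL: $kill_status"
```

Because the process never gets a chance to react to SIGKILL, it cannot clean up after itself, so sending SIGTERM first and keeping kill -9 as a last resort is usually the gentler order.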
Learning logs in Aug
- Export large tables in MySQL
- Bash completion on Debian 9
RDS as-a-fckin-service
AWS RDS documentation: Depending on the DB instance class and the amount of storage, it can take up to 20 minutes before the new instance is available.
Reality:
- DB instance status: creating
- RDS stuck in this fucking step for nearly 2 hours and counting
- db-fuck-as-a-service
Mindset for building HA and scalable systems
A system or infrastructure must have:
- Fault tolerance
- No single point of failure
- More than one or two security layers
- Auto-failover without requiring human intervention
- Heartbeat monitoring on all running components
- Infrastructure as code
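As a rough illustration of the heartbeat item in the list above, here is a minimal shell sketch. Everything in it is an assumption made for the example: each component is imagined to write the current Unix time into its own file, and the names beat, check, hb_dir and the components api, worker and db are hypothetical. A real setup would more likely use a monitoring system than files on disk.

```shell
# Hypothetical layout: one heartbeat file per component in a scratch directory.
hb_dir=$(mktemp -d)

# A component records a heartbeat by writing the current Unix time to its file.
beat() {
    date +%s > "$hb_dir/$1"
}

# Print "alive", "stale" or "missing" for a component.
# $1 = component name, $2 = maximum allowed heartbeat age in seconds.
check() {
    file="$hb_dir/$1"
    if [ ! -f "$file" ]; then
        echo missing
        return
    fi
    age=$(( $(date +%s) - $(cat "$file") ))
    if [ "$age" -le "$2" ]; then
        echo alive
    else
        echo stale
    fi
}

beat api                      # api just reported in
echo 0 > "$hb_dir/worker"     # simulate a worker whose last beat was long ago
                              # db never reported, so it has no file at all

api_state=$(check api 30)
worker_state=$(check worker 30)
db_state=$(check db 30)

echo "api=$api_state worker=$worker_state db=$db_state"
rm -rf "$hb_dir"
```

Treating a missing heartbeat as its own state matters here: a component that never reported is a different problem from one that stopped reporting, and an automatic failover would want to tell them apart.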