Personal Principles To Develop Software
I am starting a new chapter in my career. I wanted to write down the principles I have learned so far about developing software, so I can come back to them, reflect, change and add to the list.
My Principles to develop software
On Testing
- Write tests. Testing is what adds engineering rigor to software development.
- Test driven development is wonderful. Start with writing tests for any big APIs/interfaces.
- If you are touching code that already exists (refactoring or adding new functionality) and there are no tests, first write tests for the existing functionality, then write your new code and add new tests for that.
- You don’t need to write tests if only one person is working on the software. The minute you add another person, everyone needs to start writing tests.
- We test in order to verify assumptions. We verify assumptions about how the code we write works and we verify assumptions about how it interacts with code we didn’t write.
- Writing tests is a great way to solidify your contract (and is a variation of design by contract development).
- Understand the testing pyramid. At its base are unit tests (tests covering small-ish components), followed by functional tests (bigger components, using mocks for dependencies/external elements), followed by integration tests (tests for an entire component/service), followed by end-to-end tests (testing multiple large, complex systems together).
- Testing can be a scam. Specifically, 100% unit test coverage doesn’t guarantee better code quality. A commonly accepted rule of thumb is that around 80% is good enough (invest the rest of the time in higher-level tests).
- Integration tests can also be a scam. It’s very complex to test all dependencies and all failure scenarios. Mid-level integration testing is the solution (which always draws heavily on design by contract principles).
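As a minimal sketch of the "write tests for the existing functionality first" point above: the function `apply_discount` and its behavior are hypothetical, made up only to illustrate pinning down what the code does today before changing it.

```python
import unittest


def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical existing function that we are about to refactor."""
    if percent < 0 or percent > 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents - (price_cents * percent) // 100


class ApplyDiscountCharacterizationTest(unittest.TestCase):
    """Pins down the current behavior before any new code is written."""

    def test_typical_discount(self):
        self.assertEqual(apply_discount(1000, 25), 750)

    def test_discount_is_rounded_down(self):
        # 999 * 10 // 100 == 99, so the customer pays 900.
        self.assertEqual(apply_discount(999, 10), 900)

    def test_rejects_out_of_range_percent(self):
        # The error is part of the contract, so it gets a test too.
        with self.assertRaises(ValueError):
            apply_discount(1000, 101)
```

Saved as a module, this can be run with `python -m unittest`. Once these pass against the old code, any refactoring that breaks one of them has changed an assumption someone may depend on.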
On continuous integration & continuous delivery
- CI & CD are among the greatest contributions to software development in the last few years. Being able to deploy and test changes instantly is what allowed the exponential growth of software technologies.
- Invest heavily in having build systems build and deploy packages automatically as well as run tests. Running the full suite of tests automatically should be part of the CI pipeline.
- Invest heavily in deploying packages to multiple environments. Always have a test stage before your production stage. Observe the behavior when you deploy new changes (and invest in roll-back mechanisms). Monitoring is important.
- Investment in tooling is what differentiates a fast-moving company from a slow-moving one. With tooling to build, deploy, test, monitor and manage infrastructure, you can do so much more. Try to use off-the-shelf tools. Don’t reinvent the wheel.
On monitoring & observability
- Monitoring allows you to determine if your system is healthy. Observability allows you to determine what went wrong.
- Writing software is a small part of the job. Having good metrics, proper alarming and tools to look at logs and figure out what happened in your system are more important than the code or tests you write.
- If you don’t know how your code is running, you are flying blind.
- You need to monitor business metrics constantly, and you need to alarm on them. If you are only monitoring functional metrics but not business metrics, you might be missing how customers are interacting with your software, and you are practically flying blind.
- Have SLAs and targets for your metrics. A metric with no SLA is useless. Have measurable metrics and trigger alarms on them.
- My personal rule is to always be able to tell what went wrong in a system just by looking at alarms. If a dependency went down we should be able to tell right away just by looking at metrics.
- You must have a dashboard with a graph for every metric that has an alarm on it. I should always be able to figure out what is causing a problem in a system by looking at a dashboard for less than 5 minutes. This is what observability is all about.
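To make "a metric with no SLA is useless" concrete, here is a small sketch of an alarm that ties a metric to a target. The `Alarm` class, the metric names and the thresholds are all hypothetical and are not taken from any real monitoring product. It covers one functional metric and one business metric.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Alarm:
    """Hypothetical alarm: fires when recent datapoints all breach the target."""
    metric_name: str
    target: float
    breaches: Callable[[float, float], bool]  # e.g. operator.gt or operator.lt
    datapoints_to_alarm: int = 3

    def is_firing(self, datapoints: Sequence[float]) -> bool:
        recent = list(datapoints)[-self.datapoints_to_alarm:]
        if len(recent) < self.datapoints_to_alarm:
            return False  # not enough data to decide yet
        return all(self.breaches(value, self.target) for value in recent)


# Functional metric: p99 latency must stay under 500 ms.
latency_alarm = Alarm("checkout.p99_latency_ms", 500.0, operator.gt)
# Business metric: we expect at least 10 orders per minute.
orders_alarm = Alarm("checkout.orders_per_minute", 10.0, operator.lt)

print(latency_alarm.is_firing([120, 480, 510, 530, 620]))  # True
print(orders_alarm.is_firing([42, 40, 9, 38, 41]))         # False
```

Requiring several consecutive breaching datapoints is one possible design choice: it trades a slower alarm for fewer false pages from a single bad sample.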
On Metrics
- Sample but never aggregate. Aggregates reduce the correctness of the data; sampling doesn’t.
- Events, not just metrics. Events give an idea of everything that happened, not just random statistics.
- Explore, don’t dashboard. Be able to dive in and figure out what happened and how it relates to everything else.
- Event traces, not event strings. You must be able to see the entire trace of how everything happened in the system (and store that). It’s important to understand the interactions between events.
- You can never measure everything, it is wasteful and uneconomical.
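One way to read "sample but never aggregate" together with "event traces, not event strings" is to decide sampling per trace, so that every trace you keep is stored whole and unaveraged. The sketch below assumes events are plain dicts with a `trace_id` field; the event names and the hashing scheme are illustrative choices, not a reference to any particular tracing system.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List


def keep_trace(trace_id: str, sample_rate: float) -> bool:
    """Decide per trace, not per event, so kept traces stay complete."""
    bucket = zlib.crc32(trace_id.encode("utf-8")) % 10_000
    return bucket < sample_rate * 10_000


def sample_traces(events: Iterable[dict], sample_rate: float) -> Dict[str, List[dict]]:
    """Group the raw events of every sampled trace; nothing is averaged away."""
    traces = defaultdict(list)
    for event in events:
        if keep_trace(event["trace_id"], sample_rate):
            traces[event["trace_id"]].append(event)
    return dict(traces)


EVENTS = [
    {"trace_id": "t-1", "name": "request.received", "at_ms": 0},
    {"trace_id": "t-1", "name": "db.query", "at_ms": 12},
    {"trace_id": "t-1", "name": "response.sent", "at_ms": 31},
    {"trace_id": "t-2", "name": "request.received", "at_ms": 5},
    {"trace_id": "t-2", "name": "dependency.timeout", "at_ms": 3005},
]

for trace_id, trace in sample_traces(EVENTS, sample_rate=1.0).items():
    print(trace_id, [event["name"] for event in trace])
```

Because the decision is a deterministic function of the trace id, every service that sees the same trace makes the same keep/drop choice without any coordination.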
On developing software, building & complexity
- “The most common misconception/myth about programming is that the way things are done is the best way”
- Over-engineering is bad engineering
- The best engineers simplify their problems
- Knowing how to break a complex problem into smaller, simpler ones is key
- Always think in layers: think at the appropriate layer and dive deeper as you need
- Complexity in software (just like in engineering or the sciences) is where people spend most of their time. Learning to make what seems complex simple is key to being a good engineer
- Building is a way of thinking that can lead to success
On customer obsession, experimenting & failure
- If you claim you care about your customer, you must have tools to understand your customer: metrics to track customer behavior, ways to engage with customers, track conversion/frustration/etc
- Sometimes (usually) it’s hard to find out what the customer really wants; then you must experiment. Google did this under Marissa Mayer back in the day to understand which colors customers preferred (blue vs black vs whatever).
- Experiment and fail. Fail fast. Make mistakes, learn from them and keep improving and growing. The more you can experiment, the faster you can improve.
- Do the same thing with yourself: try things, experiment, fail and keep trying again.
- Never give up. Startups die far more often from self-inflicted causes, such as building something with no market need, than from running out of cash.
On optimizations
- Without metrics and data it is absolutely stupid to optimize your systems. Always run load tests, profile, or look deeply into your business metrics before you decide what to optimize.
- It is impossible to get 100% correctness/perfection in distributed systems. Always aim to have a high SLA for most of your cases.
- Try to invest time in improving what matters. Profile to really see where the system’s bottlenecks are.
- Most of the time, readability matters more than speed (that’s most of the software I have written). Speed matters a lot in plenty of cases, and for those we should still add readability using comments and proper naming of variables/methods/classes.
- Whatever you optimize for speed, efficiency, cost, always remember to optimize code readability for the person reading the code after you
- You can’t improve if you can’t measure. Efficiency comes from measuring.
- Understand your cost
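A rough sketch of "profile before you optimize", using Python’s standard `cProfile` and `pstats` modules. The workload (`handle_request` and `slow_lookup`) is invented for the example; the point is only that the report, not intuition, tells you where the time goes.

```python
import cProfile
import io
import pstats


def slow_lookup(items, wanted):
    # Membership tests on a list are linear, so this is the likely hot spot.
    return [w for w in wanted if w in items]


def handle_request():
    """Hypothetical request handler used as the workload to profile."""
    items = list(range(5_000))
    wanted = list(range(0, 5_000, 5))
    return slow_lookup(items, wanted)


def profile(func) -> str:
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    pstats.Stats(profiler, stream=out).sort_stats("cumulative").print_stats(5)
    return out.getvalue()


report = profile(handle_request)
print(report)
```

If the report confirms that `slow_lookup` dominates, turning `items` into a `set` would be the measured fix. If it does not, the time is better spent elsewhere.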
On design & refactoring
- We should always try to design and think before we write code.
- With experience and learning, one can start seeing patterns for how to design and write code. Design patterns are important (not just OO patterns but patterns for testing, for distributed systems, etc).
- A software design is usually wrong, because it makes assumptions which are usually broken when things get implemented. Keep this in mind when you start designing and when you start coding.
- Software designs should be as flexible, extensible and modular as the code that creates the software. They should be able to change and should always be changing.
- Software design is a process of hypothesis, doing something and acting on it. We must then STOP. We must finish and consider something done. Then we must look back and evaluate it. We then consider what we are going to do next and we optimize our design, improve our hypothesis and act on it. This is the process of Refactoring.
- Software design is iterative. Refactoring is continuous design
- Even initial implementation of code is always wrong. It’s always limiting and doesn’t foresee optimizations of the future (and usually if it did see optimizations, they are wrong because they are premature/made up optimizations).
- We should always refactor and extend our designs and we should always refactor and extend our code; BUT only as the need arises.
- Never refactor code because it just looks ugly. Refactor code because you are going to add something to it, because you are extending it for some immediate future use. If you refactor for sake of refactoring, you are wasting your time.
- Never rewrite code (especially without tests to validate your changes). It is always always always better to refactor in place than to rewrite code. Code carries with it the thoughts of those who came before us and wrote it. It carries hidden edge cases and fixed bugs. Every time you decide to rewrite, you lose all the knowledge inside the old code (no matter how ugly it is or how much you hate language X). Rewriting should be for extreme cases only.
- We must finish and work in increments in order to stop and reflect, and thereby improve (our learning and understanding as well as our code and design).
- Establishing a domain design (from domain driven design) is one of the fundamental ways to improve how entire teams and organizations build software. This is often overlooked but is truly the hidden secret to well architected software (BELIEVE ME IT IS NOT THE TECH USED UNDERNEATH). How everyone on the team communicates is HUGE in determining the success of the code and the business.
On documentation
- The problem with documentation is it becomes outdated really fast. A design document is usually outdated, the minute we start implementing it.
- The code itself is the best documentation of the system, but it is too complex for someone to read just to understand the system architecture, so we must put effort into documentation.
- We must try to keep documentation of what’s working at a high level (low-level documentation is usually useless). This is important if we have a lot of teams or a lot of people working on systems.
- Documentation is very important for APIs and Libraries. It should always be updated
- Documentation should live with the code. That’s usually how it works for libraries and APIs, but it should also be how it is done for large-scale systems.
On technical debt
- Everything carries technical debt. Code always has its limitations.
- As a developer once said, ‘code should have an expiry date’
- Always try to document and understand the limitations of what you write
- Verify your assumptions using testing
- The simplest solution is usually the best design: “simplest solution with loosest guarantees that is practical”
- Wanting to try out a new technology is always technical debt. There is always the debt of learning, the debt of hiring people who don’t know the technology and the time to onboard them, the debt of finding tools and integrations.
- Despite how shiny new things are, old things are sometimes better because they are well supported, more secure, etc. We should invest in new technologies and let them grow but we should do that knowing the risk.
- If you have to make changes in multiple systems to add a feature, your system is bad. This includes multiple classes in the same service (mildly bad). It is extremely terrible when software systems require changes in multiple services to enable a new feature. In a distributed system this is the biggest code smell and the biggest thing you should fix. Loosely couple your microservices, otherwise you have just built a big distributed mess.
On scalability
- In distributed systems, vertical scaling is burdensome and expensive; horizontal scaling is the way to go. Horizontal scaling is HARD.
- Coordination is what usually kills things and reduces performance. Building “stateless” software reduces complexity.
- Simple solutions are usually the most scalable
- Depend on ‘infinitely scalable’ technologies when you can if you are building applications. This means using S3, DynamoDB and to some extent things like SQS.
- Event driven software can be better than synchronous software because it enforces loose coupling. Generally, using a message broker like SQS or Apache ActiveMQ allows work to continue at a later time, which is cost effective. (Event driven might be harder to debug in some cases though!)
- Testing the scalability of a system is one of the biggest challenges of distributed systems. The important thing is to always try to test your assumptions about how scalable your system is using actual tests.
- Test other failure scenarios besides high load: test dependency failures, network failures and infrastructure going down.
- Serverless is the future. Kubernetes/containers are to serverless as virtualization is to cloud computing. Virtualization lost and nobody builds datacenters any more (besides Dropbox and a few others). Serverless is the future and I would recommend putting effort into using and building serverless rather than using containers (if you can).
- Don’t go building your own container orchestration or whatever. The winner here is obvious. Use Fargate by AWS. Seriously, it’s amazing.
- Serverless should scale infinitely, but we should test it, invest in it and think of ways to make it better (think: serverless Elasticsearch).
- The best ways to scale are: divide and conquer, approximate correctness, and add jitter.
- You have to give up durability or consistency sometimes. That’s okay, think about business logic. Sometimes you can add latency and customer won’t notice or care
- The cheapest function call is the one that never happens. Remove dependencies you don’t need. Simplify your system.
- Adaptive, reactive, serverless seem like the way to go.
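The "add jitter" advice above is often applied to retries, so that many clients failing at once do not all retry at the same moment. Below is a minimal sketch of exponential backoff where each delay is drawn at random between zero and the current ceiling. The function names, the default delays and the choice of `ConnectionError` as the retryable error are all assumptions made for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def jittered_delay(attempt: int, base: float = 0.1, cap: float = 5.0, rng=None) -> float:
    """Pick a random delay between 0 and an exponentially growing ceiling."""
    rng = rng or random
    return rng.uniform(0.0, min(cap, base * (2 ** attempt)))


def call_with_retries(operation: Callable[[], T], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry operation on ConnectionError, sleeping a jittered delay in between."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts, surface the failure
            sleep(jittered_delay(attempt))
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` in as a parameter keeps the retry logic testable without actually waiting, which fits the earlier point about verifying assumptions with tests.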
On Software topics
- Caching: Caching is hard, it’s as hard as naming variables: the two hardest problems in computer science. Add jitter, especially for caching. We usually don’t want all cache entries to expire and try to retrieve data at the same time.
- On serialization: Serialization will cause bugs, so always test. Be careful if you change serialization libraries. Serialization is costly. Observe whether it becomes a bottleneck for your latency.
- AWS: is bomb
- Serverless is even better, especially with Java
- On JS frameworks: seriously, it’s overwhelming. Someone tell them to stop creating frameworks!!!
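For the caching point in this list, here is a small sketch of what adding jitter to cache expiry could look like. `JitteredTTLCache` is a made-up class, not a real library: each entry gets its TTL nudged by a random amount, so entries written at the same time do not all expire (and reload) at the same time.

```python
import random
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class JitteredTTLCache:
    """Hypothetical in-memory cache whose entries expire at slightly different times."""

    def __init__(self, ttl_seconds: float, jitter_fraction: float = 0.1,
                 clock: Callable[[], float] = time.monotonic, rng=None):
        self._ttl = ttl_seconds
        self._jitter = jitter_fraction
        self._clock = clock
        self._rng = rng or random.Random()
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable, load: Callable[[Hashable], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, no call to the backing store
        value = load(key)
        # Spread expiry over ttl +/- jitter so reloads do not line up.
        spread = self._ttl * self._jitter
        expires_at = now + self._ttl + self._rng.uniform(-spread, spread)
        self._entries[key] = (expires_at, value)
        return value
```

With `ttl_seconds=60` and `jitter_fraction=0.1`, an entry lives somewhere between 54 and 66 seconds. The clock and random source are injectable only to make the behavior easy to test.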
On security
- Security is important, though still usually overlooked
- Security must be part of the language you use. This is one reason Java is sometimes so popular: it is known to be secure.
- Developing software means that it scales and never goes down, and also that it never compromises the data of users. Both are important.
- Every software developer must understand the principles of software security in order to be a good engineer. It takes time but is a very valuable skill
On Agile Development
- Sprint points and estimation usually measure how well we can estimate or how well we can report to managers. They don’t increase software development efficiency.
- Agile processes should improve the efficiency of how the team runs (with just small sacrifices for proper tracking).
- I believe sprint planning should be lightweight and easy, and should focus on setting priorities for the next sprint. I don’t believe in estimation at the sprint level. Optimize for the utilization/productivity of the developers.
- Estimation is useful in T-shirt sizes for projects, but estimations are always wrong. Estimations are only useful in comparison to each other.
- Pair-programming, where two people sit at the same screen, is usually wasteful. It doesn’t allow engineers to think, then discuss, and then code. I don’t like limiting it to two people. Ideas should be triangulated.
- Working together tightly with other people, sitting closely and bouncing ideas off each other all the time is the best way to code.
- Coordination and communication are always the biggest overhead.
On team culture
- I believe in philosophy over doctrine (“we should strive to test when it makes sense” rather than “we must always test”).
- Conventions are good, they especially help establish a base for everyone both junior and senior on the team
- Enforce rules using automation: check-style, code coverage etc. I think it’s absolutely useless and wasteful when someone makes a style comment on code
- Code reviews are generally good (for growth of junior developers especially). They are pointless sometimes and I hate when they block meaningful work and progress.
- The biggest blocker internally within a team is usually code reviews. Always build a system in the team to reduce the blockage of code reviews.
- Give everyone responsibility, diffuse responsibility. Everyone must try everything. Nobody should be doing the same thing over and over (since that concentrates knowledge and limits its expansion).
- Be humble. Listen more, talk less. Ask questions, talk less.
- Wise teams have a diversity of opinions and independence
- There is no one-size-fits-all developer. Always find a team fit, a product fit and a vision fit.
- Organization matters. Having a vision matters. Having a manager who believes in you matters.
On mentoring and growth
- Create guidelines for mentorship
- Set specific goals
- Allow people the opportunity to grow and make mistakes. Sometimes it’s hard to see people struggle, but we must let people develop their own ways and figure things out in order to grow (because usually that’s how we ourselves learned too).
- Giving the answer right away is not good. People might think differently, the situation might be different and people might come up with better ideas (instead of saying “go use X caching technology”, allow the person to investigate and figure out the best solution themselves).
Things I want to do more this year:
- How to design and build simpler software for complex problems
- Ants: Don’t build software that requires leaders. Ants don’t have leaders, but they coordinate with each other just by passing knowledge to their surroundings (the ants around them). There is no global communication among ants. Messages are broadcast one by one. Ants are simple and that’s the way.
- Use Akka for writing actor model software and learn how it can be used in distributed systems
- Learn Go and Kotlin for Java
- Running better chaos engineering: resilience
Do you have any ideas for things you learned/think are important?