Bias to the server-side

Problem statement

I was recently working on a support issue, which had client- and server-side aspects. Complicating the issue, we only had partial visibility into server-side health and no visibility into client-side health. It was hard to even tell where to start investigating. We were also working closely with a partner, who could give us some visibility, but with high coordination cost.

One approach was to create client visibility for us and the partner, but this would take time to roll out, didn’t immediately reduce the coordination cost and risked fatiguing the partner.

An alternative approach was to increase the server-side visibility. We took this approach because we could start investigating immediately (no coordination cost or roll out latency). We might even be able to resolve the issue without requiring the partner to do any work. Also, having more visibility and confidence in the server-side would help if/when we do need to make client-side changes.

Solution

So, my takeaway is simple: when faced with a choice between server- and client-side options, bias toward server-side.

From a product perspective, any work required of customers is friction to adoption. From a technical perspective, it’s much easier to change servers than clients.

The value of a dashboard

Problem statement

A project I’m familiar with recently had a series of issues. Each issue was investigated somewhat independently. It was hard to share common code, share data across roles and track progress over time.

Solution

  1. Capture canonical queries in version control
  2. Periodically run queries, persist and visualize output (aka ETL)
  3. At a higher level, invest in tooling to facilitate such dashboard creation
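
Steps 1 and 2 can be sketched in a few lines. This is a minimal sketch, assuming the data lives in SQLite; the `tickets` table, the query names, the `dashboard_snapshots` table and the helper functions are all hypothetical, and in practice the queries would live in files under version control and a scheduler would call `run_snapshot`.

```python
import sqlite3
from datetime import datetime, timezone

# Canonical queries; in practice these would be .sql files in version control.
# The `tickets` table and the query names are hypothetical.
QUERIES = {
    "open_tickets": "SELECT COUNT(*) FROM tickets WHERE status = 'open'",
    "closed_tickets": "SELECT COUNT(*) FROM tickets WHERE status = 'closed'",
}


def run_snapshot(conn, now=None):
    """Run each canonical query and persist one row per metric."""
    now = now or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dashboard_snapshots "
        "(taken_at TEXT, metric TEXT, value REAL)"
    )
    for name, sql in QUERIES.items():
        (value,) = conn.execute(sql).fetchone()
        conn.execute(
            "INSERT INTO dashboard_snapshots VALUES (?, ?, ?)", (now, name, value)
        )
    conn.commit()


def history(conn, metric):
    """Return (taken_at, value) rows for one metric, oldest first, ready to plot."""
    return conn.execute(
        "SELECT taken_at, value FROM dashboard_snapshots "
        "WHERE metric = ? ORDER BY taken_at",
        (metric,),
    ).fetchall()
```

Persisting a timestamped row per metric is what makes it possible to look for changes week over week as fixes roll out.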

The end result is much more awareness of the underlying data. Folks in different roles can see the data and ask questions, which often improves the quality of analysis. For example, we now review the dashboard weekly and look for changes as we roll out fixes. Because we now have a pipeline, we can also run different data sources through it to check the analysis.

Resume guidance

I’ve recently been reviewing student resumes as part of a university recruiting program and a few best-practices stand out.

Google has a good video on resume formatting that describes most of these practices. Where it does, I note the time in the video.

Ok. Here are the best-practices:

  • Recommended resume format at ~2:30
  • Resumes are read in the context of a job description (example). Companies hiring for a job read a resume to see if you have the training and/or experience to do that job, so
    • You can briefly describe what the project was, but invest most space describing what you did and how you did it. Google recommends a phrasing at ~6:00: “Accomplished [x] as measured by [y] doing [z]”
    • Bold key words to highlight your toolkit (~5:25), eg “backend engineer”, “python”, etc, so the reader can match the resume to the role at a glance.
  • It takes a lot of effort to tune a resume for all applications, but may be worth it for the 1-2 jobs you really want
  • Ideally, your resume tells a story, eg some Python in year 1, more Python in year 2, created a service using Python in year 3 –> this person has experience in Python and is using it to start exploring service eng
  • Submitting a resume may feel like an impersonal process, but it’s just people on the other side looking for new teammates. That’ll be you after you’re hired 🙂 The points above try to make it easy for that person to see you’re a great fit for the job.

Squads

This is an organizational pattern I like:

  • 2-5 ppl
  • Cross-functional
  • Focused on a specific goal
  • Weekly demo to squad

I’ve heard this referred to as a “squad”, “swarm”, “e team” and “feature team”.

One of the nice things is the sense of camaraderie from working closely with a small group on a specific goal. Another nice thing is the group dissolves after the goal is accomplished, giving the members closure and chances to try new things without switching teams. Another benefit is broad awareness of how a system works.

This pattern works well in a larger context:

  • Shared code ownership
  • 10-50 ppl
  • Focused on a specific, but larger goal
  • Fortnightly demos from all squads
  • Shared calendar for all squad coordination meetings

When a squad accomplishes its goal, the members dissolve into the larger group. Individuals can learn about work in the larger group by attending the fortnightly demos and/or sitting in on other squads’ coordination meetings.

For example, a product may be supported by several teams. To avoid exposing the org chart in the product, make a large team whose goal is to make that product excellent. Define a squad for each significant work item. All members of the large group are free to contribute code to the product.

An underlying principle is alignment around customer value over specific products or features. Rather than a group of people owning a code base in perpetuity, regardless of the amount of work required, squads form in response to need.

A counter-example would be several teams supporting a product and that product having disjoint features. Another counter-example would be a large team trying to maintain the interest of all its members in weekly coordination meetings. Yet another counter-example would be lots of individuals working in isolation, esp if they’re doing so to avoid coordination cost.

GitHub Pages redirection

Solution

  1. Install the jekyll-redirect-from plugin
  2. Update the front matter on posts to include redirect_to: https://new.site/new/path
  3. Load a post and observe the browser redirect to the new location 👍
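
For a blog with many posts, step 2 can be scripted. This is a minimal sketch, assuming each post starts with a front matter block delimited by `---` lines; the `add_redirect` helper is hypothetical, and mapping each old post to its new URL is left to the caller.

```python
def add_redirect(post_text, new_url):
    """Insert `redirect_to` into a post's YAML front matter.

    Assumes the post starts with a front matter block delimited by `---` lines.
    Posts that already have a redirect, or lack front matter, are returned unchanged.
    """
    lines = post_text.split("\n")
    if not lines or lines[0].strip() != "---":
        return post_text
    try:
        # Index of the line that closes the front matter block.
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        return post_text  # Unterminated front matter; leave it alone.
    if any(line.startswith("redirect_to:") for line in lines[1:end]):
        return post_text
    lines.insert(end, f"redirect_to: {new_url}")
    return "\n".join(lines)
```

Returning the post unchanged when a redirect already exists makes the script safe to re-run.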

Problem statement

I was using GitHub Pages (Jekyll) for blogging, but recently switched to WordPress. I didn’t want to break old links, so I needed a way to permanently redirect.

An SO answer got me thinking about a meta tag. Is there an efficient way to add this meta tag to posts? Yes. Some old GitHub Enterprise Pages docs recommend using the jekyll-redirect-from plugin. We can confirm it’s supported for non-Enterprise Pages by looking at the list of supported plugins. And I see it works via a meta tag.

Is the redirect permanent? Sort of. The HTTP response from GitHub is 200, but the HTML redirect includes a canonical link, eg:

$ curl -v https://erikeldridge.com/notes/google-cloud-workstation.html
< HTTP/2 200 
< server: GitHub.com
< content-type: text/html; charset=utf-8
...
<!DOCTYPE html>
<html lang="en-US">
  <meta charset="utf-8">
  <title>Redirecting…</title>
  <link rel="canonical" href="https://blog.erikeldridge.com/2019/03/02/google-cloud-workstation/">
  <script>location="https://blog.erikeldridge.com/2019/03/02/google-cloud-workstation/"</script>
  <meta http-equiv="refresh" content="0; url=https://blog.erikeldridge.com/2019/03/02/google-cloud-workstation/">
  <meta name="robots" content="noindex">
  <h1>Redirecting...</h1>
  <a href="https://blog.erikeldridge.com/2019/03/02/google-cloud-workstation/">Click here if you are not redirected.</a>
</html>

That script tag looks broken, but assigning a URL to `location` is shorthand for setting `window.location.href`.
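
To spot-check many old URLs, the redirect targets can be pulled out of the returned HTML. This is a rough sketch using Python's standard `html.parser`; the `RedirectTargets` class and `redirect_targets` function are hypothetical names, and fetching the page is left out.

```python
from html.parser import HTMLParser


class RedirectTargets(HTMLParser):
    """Collect the canonical link and meta refresh target from a redirect page."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.refresh = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("http-equiv") or "").lower() == "refresh":
            # The content attribute looks like "0; url=https://example.com/".
            _, _, target = (attrs.get("content") or "").partition("url=")
            self.refresh = target or None


def redirect_targets(html):
    """Return (canonical_url, meta_refresh_url); either may be None."""
    parser = RedirectTargets()
    parser.feed(html)
    return parser.canonical, parser.refresh
```

If both values match the expected new URL, the post redirects where it should.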

PlayFab’s LiveOps guide

My experience is largely in features and infrastructure for growing and retaining users, aka “growth”. Recently, I learned the games industry has a comparable concept “LiveOps”. I’ve found value in using the latter to learn more about the space in general.

PlayFab has an excellent guide to LiveOps. The guide is a brief and accessible reference, so I’ll just jot notes below.

Introduction

The guide summarizes LiveOps as “Games are shifting from one-off experiences to services that evolve. Developers of successful live games focus on understanding their players, meeting their individual needs, and cultivating long-term relationships”

In growth-speak, I’d phrase this as analytics, personalization and retention. There is some direct association with growth: “We’re investing in games that people play for longer and engage with much more deeply … to drive growth …“

I guess the “live” in LiveOps refers to “services that evolve”: “… live services represented nearly 58% of Electronic Arts’ net revenue …” I can see how this would be a big shift from encoding all logic in a released binary. “save client updates for entirely new features or large assets”

“With a LiveOps game, the real work starts with launch instead of ending there” I think there’s less of a distinction in a non-game app; most apps already pull content from a network.

A summary of LiveOps features:

  • “server-side configurations …”
  • “… content data untethered from client versions …”
  • “… in-depth analytics”

“Content data” refers to “… new experiences and content, thereby extending the lifetime of our games“, which explains the claim that LiveOps can reduce up-front investment.

“… the ‘live’ part of LiveOps goes through three post-launch stages:

  1. Iterating your Game …
  2. Personalizing the Player Experience …
  3. Managing the Community …”

I think all of these apply to apps in general.

I like how the breakdown also indicates infra and talent required in each step:

  1. Iteration requires “build, test-and-deploy pipeline, basic analytics, and content configurations“
  2. Personalization requires “data analysts and product managers to use more sophisticated tools such as recommendation systems and live events managers“
  3. Community management requires “customer support staff, marketing, and community managers … guild systems, user-generated content, and multiplayer services for matchmaking, cross-network play, and communications.”

The guide presents these as sequential steps of maturity. In my experience with growth, 1 and 3 came before 2, since generating per-user state was relatively resource intensive. Also, we could start with relatively naive approaches to 2 and 3, eg friend recommendations by a static topic like “sports”, and then layer on more sophisticated alternatives, eg per-user behavioral predictions.

Connecting to people

LiveOps has a user-centric perspective: “LiveOps developers know that players and communities evolve. When creating a game, we’re not movie directors with a singular vision, but more like TV network program managers … LiveOps games are player-centric and react to player desires and needs …”

I’m a fan of a customer-centric perspective. Differentiating user-centric seems like it should be obvious, but it’s nice to see it emphasized.

My recent experience is in growth as a service, which is why I differentiate “users” from “customers”. (Customers would be apps that have users/players.)

Acquisition

“With LiveOps, acquisition is an ongoing process” I guess this recognizes that people may come and go from a game, although in the terminology I’m familiar with, returning would be “resurrection” or “reactivation”. (“Reactivation” is listed later as an example of acquisition.)

I appreciate the list of common acquisition sources:

  • Store discovery
  • Public relations
  • Advertising
  • Cross-promotion
  • Influencer marketing
  • Social installs, eg shares
  • Reactivation

Helpful tip: “Track player engagement and retention based on source of acquisition and look for trends” Platforms providing acquisition channels should also provide attribution, eg Google Play Store’s campaign attribution passed to Android’s Install Referrer API.
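
As a sketch of what tracking retention by acquisition source could look like, here is one way to compute day-N retention per source. The input shape (a tuple of source, install date and active dates per player) and the function name are assumptions for illustration; a real pipeline would read these from attribution and event data.

```python
from collections import defaultdict
from datetime import date


def retention_by_source(players, day=7):
    """Return {source: fraction of players active exactly `day` days after install}.

    `players` is an iterable of (source, install_date, active_dates) tuples.
    """
    installed = defaultdict(int)
    retained = defaultdict(int)
    for source, install_date, active_dates in players:
        installed[source] += 1
        if any((d - install_date).days == day for d in active_dates):
            retained[source] += 1
    return {source: retained[source] / installed[source] for source in installed}
```

Comparing the resulting fractions across sources, and over time, is one way to look for the trends the guide mentions.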

Kind of obvious, but the guide recommends A/B testing reactivation inducements. Later the guide simply recommends testing everything all the time.

Retention

Retention is “one of the only data-supported ways to know if players enjoy playing“

Common techniques for increasing retention:

  • Adding content
  • Stickiness due to investment – this comes up later in the “conversion” section
  • Social connections
  • Compelling game mechanics, eg Go has “simple rules that allow for endless new strategies”

Helpful tip: “Try to communicate only what’s interesting and valuable, and mix up rewards so they don’t become background noise” I’ve heard this phrased as “fatigue”. Messaging platforms should provide features to help customers avoid fatiguing users.

Engagement

The definition of “engagement” or “active” usage is often mysterious to me, so I appreciate the general description: “Active communities engage with a game by playing, providing feedback, and promoting (discussing online or in person, creating fan content, and so on) … Common reporting period intervals include 1-day, 7-day, and 30-day.” An arbitrary post from SurveyMonkey has some context for MAU.
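
To make the reporting intervals concrete, here is a minimal sketch of counting active users over a trailing window. It assumes “active” means at least one logged event in the window, which is only one of several possible definitions; the function name and the event shape are hypothetical.

```python
from datetime import date, timedelta


def active_users(events, as_of, window_days):
    """Count distinct users with at least one event in the trailing window.

    `events` is an iterable of (user_id, event_date). The window covers
    `window_days` days ending on `as_of`, inclusive.
    """
    start = as_of - timedelta(days=window_days - 1)
    return len({user for user, day in events if start <= day <= as_of})
```

Calling this with a window of 1, 7 or 30 days gives 1-day, 7-day and 30-day actives for the same event log.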

Interesting: “engagement is the only KPI that some studios measure.“

Another relatively obvious tip: “Look at how studios with games like yours engage their community as a baseline for your own engagement efforts”, ie “competitive analysis”. But still, as a general primer, I appreciate the comprehensiveness.

Support

“Your team needs the tools to isolate and identify problems, fix or escalate them … and communicate with players throughout the process.” 👍

Common tools:

  • External-facing ticketing, so internal and external actors can coordinate
  • “ ability to look up individual player profiles and make manual changes”. Ideally, a customer can do this themselves.
  • “A way to suspend or ban players”
  • “A way for players to upload crash logs” Seems this could be automatic, eg Crashlytics
  • “Ways to send messages to players” (and customers)

Good tip: “Changes in support contact and resolution rates (e.g. number of support tickets opened and closed) can indicate larger issues.”

Data

Analytics

I like the list of common metrics:

  • “ARPU (Average Revenue Per User) … general business health”. I’m guessing percentiles would be good too
  • “ARPPU (Average Revenue Per Paying User) … for gauging monetization strategies, such as store design or DLC”
  • (from the monetization section) “Paying rate is just as important as ARPPU for measuring monetization” I get the impression paying rate refers to the percentage of users who pay for anything
  • “Unique Logins … indicates newly acquired players”
  • “Conversion Rate … success at converting free players into paid players.”
  • “Retention … how well your game keeps players interested”
  • “Average Session Length … how long your gameplay loop stays fun”
  • “Session Frequency … how often players engage with the game”
  • “LTV (Lifetime Value) … the total number of unique players divided by total revenue generated”
  • “Errors … how stable your game is”
  • “Content Logs … popularity, stability, and engagement of specific game content” This seems relatively game-specific, but I guess it could be generalized to feature-specific metrics
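
A few of these metrics are simple ratios, so a small sketch may help show how they relate. It assumes a mapping of player id to revenue for some period, with zero for non-payers, and it reads paying rate as the share of players who paid anything, which is my interpretation rather than the guide's definition.

```python
def monetization_metrics(revenue_by_player):
    """Compute ARPU, ARPPU and paying rate from {player_id: revenue}.

    Players with zero revenue count as players but not as payers.
    """
    players = len(revenue_by_player)
    payers = sum(1 for revenue in revenue_by_player.values() if revenue > 0)
    total = sum(revenue_by_player.values())
    return {
        "arpu": total / players if players else 0.0,
        "arppu": total / payers if payers else 0.0,
        "paying_rate": payers / players if players else 0.0,
    }
```

ARPU equals ARPPU multiplied by paying rate, which is one way to see why the guide treats paying rate as just as important as ARPPU.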

Good point: “Some metrics are best reviewed over long periods of time (e.g. Avg. Revenue), while others benefit from constant real-time updates (e.g. Errors)” And this may change over time, eg crash rates while changing a flag value.

Interesting: “Instead of boosting acquisition through marketing or app store advertising, they built traction by focusing on early retention metrics such as daily active users, session length, and crashes“ and “direct player feedback”

Good idea: “implementing direct player feedback through a public Trello community board,  letting users log bugs directly, and holding community votes on what to work on next.“

Good point: “Knowing your retention rate is important, but offers no insight on how to fix it. For that, you need to do a deep drill-down or segment your audience and experiment.”

Segmentation

“Segmenting groups is a necessary step to deliver the best content to the most players”

Good tip: “your analytics toolset should let you define custom segments”

Common use-cases:

  • “Designers segment players based on in-game behavior to understand their needs and develop player-centric content” presumably to increase retention
  • “Monetization teams use segments to understand spending patterns, identify fraudulent behavior, and predict revenue”
  • “Marketers create custom segments and optimize messaging for each to acquire or engage players”

“The most important thing about the testing aspect is the cohort and the segmentation …” 🤔 I guess an example would be identifying a low spending segment to test a feature to increase spending, as opposed to testing it on everyone, some of whom may already be spending a max.

A basic funnel:

  • New users
  • Non-spenders
  • Spenders
  • High spenders

“Once you [define a funnel like this], it’s easy to track your progress getting players to move through the funnel from one segment to the next.”
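
The basic funnel above could be sketched as a segment function plus a count per segment. The thresholds (what counts as a new user or a high spender) and the field names are illustrative assumptions, not values from the guide.

```python
from collections import Counter


def segment(player, high_spend_threshold=100.0, new_user_days=7):
    """Assign a player to exactly one funnel segment.

    `player` is a dict with `days_since_install` and `total_spend`.
    The thresholds are illustrative assumptions.
    """
    if player["total_spend"] >= high_spend_threshold:
        return "high_spender"
    if player["total_spend"] > 0:
        return "spender"
    if player["days_since_install"] < new_user_days:
        return "new_user"
    return "non_spender"


def funnel_counts(players):
    """Count players per segment; compare snapshots over time to track movement."""
    return Counter(segment(p) for p in players)
```

Running `funnel_counts` on a schedule and comparing the counts shows whether players are moving from one segment to the next.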

Good tip: “machine learning can help you automatically segment players”

Experimentation

“Good experiments have a hypothesis or some sort of goal KPI to change” 👍

I’m glad this is stated: “The size of your audience can affect how complex your testing can be. A game with millions of players can easily test subtle changes, but one with a smaller audience will only get significant data from tests with stark variations. The same goes for how many tests you can run simultaneously—a smaller player base means fewer simultaneous tests are statistically reliable.” I’d also say an opinionated approach, direct feedback and/or severely limited test concurrency can be a more efficient guide for a small user base than cluttering code with conditional logic and waiting a long time for significant data. Nice: “monitor user feedback … when player data is in short supply.”
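
The point about audience size can be made concrete with the textbook normal-approximation formula for comparing two proportions. This is a rough sketch for intuition, not the guide's method; the function name and the default significance level and power are my assumptions.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, treatment, alpha=0.05, power=0.8):
    """Approximate players needed per variant to detect baseline -> treatment.

    Uses the standard normal-approximation formula for comparing two
    proportions with a two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = baseline * (1 - baseline) + treatment * (1 - treatment)
    return ceil((z_alpha + z_power) ** 2 * variance / (treatment - baseline) ** 2)
```

With these defaults, detecting a move from 10% to 20% needs roughly two hundred players per variant, while detecting a move from 10% to 11% needs well over ten thousand, which is why small games are limited to stark variations.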

Good tip: “be sure the test encompasses at least a whole week to measure fluctuations between weekday and weekend players” and users in different regions.

Interesting: “Make sure if one player sees something different from another, they can clearly understand why” I wonder if an example would be providing UI to list active experiments.

In-game surveys should “only ask one question at a time”

“Failed experiments are an important part of the process to learn” 👍

Best practices:

  • “Learn which metrics best capture performance for your game’s KPIs, and set appropriate periods to monitor and review them”
  • “Test gameplay mechanics early. It’s harder to test changes … after players have developed expectations” Reminds me of changes to Twitter UX basics, like changing the ⭐️ → ❤️
  • “When players have problems, analyze event history …” which implies an ability to collect and analyze such history is important, which may not be obvious before an issue happens
  • “Use limited-time events to test changes to gameplay—players are often more tolerant of gameplay changes when called out as events” Good idea. Reminds me of sports-based features, eg World Cup. I hadn’t thought of them as an opportunity to experiment w basic mechanics.
  • “Chart out the “funnel” progression for players in your game and experiment with ways to motivate players to move through your funnel”
  • “Ensure your analytics tools let you view KPIs by segment”
  • “Establish a clear success metric to gauge the impact of tests”
  • “Test qualitative factors by polling players with in-game surveys”

Launching

“It helps to put together a designated LiveOps team” I’ve also seen feature teams own their launches.

This seems like a launch checklist:

  1. Feedback loop and KPIs
  2. Support channels and data access guidelines
  3. Incident response strategy

Soft launch

Example soft launch: “choose a smaller geographic area, ideally with the same language as your core audience … and run your game for a few months” or “ limiting your initial audience with an Early Access or Beta period”. EAP and beta are something I have more experience with.

Good idea: “pay close attention to the core engagement metrics” for soft launch

Good idea: “During soft launch, confirm that you can update the game without causing disruption to players – and make sure that you can roll back changes if problems arise during deployment”, ie verify LiveOps infra works as expected.

“Many developers are moving away from soft launches in favor of lean launches“ 🤔… “As a small, indie studio, you don’t have the money to do user acquisition for a soft launch”

Lean launch

A lean launch:

  1. deploys an MVP version of the game
  2. connects with a target audience, and then 
  3. tunes the game based on player data and feedback

Requirements:

  • reliable data pipeline
  • smaller manageable audience without inflated expectations
  • be able to adapt your game quickly

“Collecting and analyzing your crash data and retention metrics is a must”, which is “ dependent on an effective LiveOps pipeline that allows for developing several pieces of content at once, and agile deployment”

Best practices

  • “Assemble a LiveOps team”
  • “Develop a calendar” to coordinate live updates post-launch
  • “Put validation checks in place” I guess because this approach is premised on making lots of significant changes, so the cost of failure is high
  • “Rehearse key LiveOps tasks”, which is good advice, but kind of contradicts an earlier statement “There’s no such thing as a dry run in live games”
  • “Ensure your team has a way to roll back changes ”
  • “Set roles and permissions”

Game updates

“Game updates aren’t limited to new levels or game mechanics. They can consist of new items for purchase, events, balance patches, bundles, or anything else that encourages a player to come back and play more.”

“Understanding your player base is a key element in designing and delivering relevant updates”

“Frequency and consistency are as important as quality when making updates”

Tip: experiment with time between updates in addition to the update content “to see if they impact engagement or retention.”

“save client updates for entirely new features or large assets … assets such as art and gameplay logic are included in the client, but how those assets are displayed to players is driven by server-side logic … plan your content architecture in advance and move as much of your game logic as possible onto the server or cloud.”
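
One common shape for this is a client that ships with defaults and lets a server payload override them. This is a minimal sketch of that idea, not PlayFab's API; the config keys and the `load_config` helper are hypothetical, and the network fetch is left out.

```python
import json

# Defaults ship inside the client so the game still works offline.
# All keys here are hypothetical.
DEFAULT_CONFIG = {"daily_reward_coins": 100, "event_banner": None, "max_lives": 5}


def load_config(server_payload):
    """Merge a server-provided JSON config over the bundled defaults.

    `server_payload` is the raw response body, or None if the request failed.
    Unknown keys are ignored so an older client tolerates newer configs.
    """
    config = dict(DEFAULT_CONFIG)
    if server_payload is None:
        return config
    try:
        overrides = json.loads(server_payload)
    except json.JSONDecodeError:
        return config  # Malformed payload; fall back to defaults.
    if not isinstance(overrides, dict):
        return config
    config.update({k: v for k, v in overrides.items() if k in DEFAULT_CONFIG})
    return config
```

Falling back to bundled defaults also covers the offline case the guide raises later.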

Best practices

  • “Make a list of everything in your game that could be considered ‘content’”
  • Plan how content will get to the client 👈
  • “Think about offline mode” 👍
  • “Vary your updates” between temporary and permanent changes
  • “Consider targeting new content to specific player segments”
  • “Consider cloud streaming or downloading assets in the background during gameplay to reduce friction”

Events

“A live event is any temporary but meaningful change to a game’s content”

“Anything can be an event … Timebox it, reward it, there you go …”

“Successful events often include:

  • A measurable goal …
  • A limited-time period …
  • Engaging themes and content …
  • Surprise and predictability …
  • A sense of community effort …
  • An effective means of communicating with players …”

Reminds me of a “campaign” in other applications of targeted content.

Experiment w event frequency: “ By experimenting with event timing, they were able to settle on an event schedule that raised their baseline engagement while also minimizing lapsed players”

“Consider running repeatable events … Holidays work because players will be more understanding of temporary changes, and often have more time to play”

“Adding a special, limited-time leaderboard for a specific in-game goal is a common event.”

“Events can also run in parallel”

Calendars

A calendar can help reduce the complexity of orchestrating events and avoid fatiguing users.

Communication

“Great player communication is critical to the success of live events”

Push notifications, email and social media are common channels of event communication.

Best practices

  • “Make a list of everything you might want to change as part of an event”
  • “Prepare to run events from the server, without a client update”
  • “Find natural ways to promote upcoming events in-game”
  • “Capture event data in your data warehouse“ for later analysis and segmentation
  • “Let your team be flexible when creating events” This seems like basic team management; micro-managing is bad
  • “Set goals for events” so we can evaluate performance
  • Maintain a calendar for coordination and to avoid fatiguing users
  • “Use events to experiment with ideas”
  • “Establish an event framework” that separates unique and repeatable aspects of an event

Monetization

“… every discussion about monetization should consider:

  • The kind of game you’re building …
  • … [aligning] player needs with your revenue goals …
  • Ethical guidelines for monetization …
  • How your competition is monetizing … “

Microtransactions

Aka “in app purchases” 👈

Common forms:

  • “Cosmetics are items that affect the physical appearance …”
  • “Account Upgrades are permanent enhancements to a player account …”
  • “Consumables are items that can be used once for a temporary effect …”
  • “VIP Programs are subscription-based programs …”
  • Content Access
  • “Random Boxes (or loot boxes) are items players can purchase without knowing exactly what they’ll receive”

Common “elements of in-game store management:

  • Presentation … should be easy to use …
  • Catalog management … (A good rule of thumb is once a week.) …
  • Pricing …
  • Offers and promotions …
  • Fraud … As soon as you start offering items with real-world currency value, there will be fraud …”

Nice: “Use server-side receipt validation … for added security”

Conversion

I really like this topic. From the growth perspective, this is part of acquisition.

“two main challenges:

  1. eliminating the barriers to entry
  2. showing your players value”

The first one I’ve come to see as a fundamental product consideration. If we want people to do anything, we need to minimize the cost of doing that thing. I think this also ties into an engineering best-practice: keep migrations and changes separate.

Regarding the second point, I think a great counter-example is a paywall before showing any content. “players have more of a propensity to pay once they have a trust relationship with the game and the developer”

How players spend

I don’t have experience with in-app purchases, so this is all interesting.

“Players will have different levels of spending they are comfortable with”

“It’s easy to get caught up focusing on big spenders or trying to sell as much as possible as soon as the game launches. But those methods are often unreliable, unsustainable, and may reflect poorly on your studio” Reminds me of low-quality ads, which eventually drive users off the platform.

“Build a broader, more reliable, and engaged spending base rather than chasing ‘whales’ … A thousand players paying $10 is preferable to ten players paying $1000 because there is more opportunity for repeat purchases.”

Advertising

“One of the most popular forms is rewarded video—short videos often promoting a different game or app, watched for an in-game reward or more playtime … [beware] players might be lured away by a competitor’s game.”

“As with almost every other LiveOps effort, you need to continuously test different solutions.”

Good idea: ”Many developers segment their audience and only show ads to certain segments, often limiting them to non-paying players.”

Targeting

“You can usually do an on-the-fly calculation to compare the value per impression of an in-house-ad versus one from an external network, so you can decide what to show for a given player segment.”
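
A sketch of what that on-the-fly comparison might look like, assuming the in-house ad's value is estimated as conversion rate times value per conversion, and the external network is summarized by its eCPM (revenue per 1000 impressions). Both inputs and the `pick_ad` name are my assumptions.

```python
def pick_ad(in_house, external_ecpm):
    """Choose the ad source with the higher expected value per impression.

    `in_house` is (conversion_rate, value_per_conversion) for a cross-promo ad.
    `external_ecpm` is the network's revenue per 1000 impressions.
    """
    conversion_rate, value_per_conversion = in_house
    in_house_value = conversion_rate * value_per_conversion
    external_value = external_ecpm / 1000
    return "in_house" if in_house_value >= external_value else "external"
```

The inputs would differ per player segment, so the same function can give different answers for different segments.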

Economies

“Many games use two virtual currencies: a “soft” currency earned ingame, and a “hard” purchased currency”

“Build a matrix of all the sources and sinks for in-game resources and build a model of the economic activity you can adjust in a tool such as Microsoft Excel, without rolling out updates.” I’ve heard of managing config this way.
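
The same sources-and-sinks model fits in a few lines of code as well as a spreadsheet. This is a toy sketch; every name and number is hypothetical, and it assumes a single soft currency with a constant daily flow for a typical player.

```python
# Coins per player per day; every name and number here is hypothetical.
SOURCES = {"daily_login": 100, "level_complete": 250, "rewarded_video": 50}
SINKS = {"extra_lives": 150, "cosmetics": 100, "upgrades": 100}


def net_flow(sources, sinks):
    """Net soft currency a typical player accumulates per day."""
    return sum(sources.values()) - sum(sinks.values())


def days_to_afford(price, sources, sinks, starting_balance=0):
    """Days until a player can afford an item, or None if the balance never grows."""
    if starting_balance >= price:
        return 0
    flow = net_flow(sources, sinks)
    if flow <= 0:
        return None
    return -(-(price - starting_balance) // flow)  # Ceiling division.
```

Adjusting a source or sink and re-running `days_to_afford` shows the effect of a tuning change before it ships.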

“What we want is sustained investment and signs that a player has really perceived value…”

Best practices

  • Choose a strategy
  • Set ethical and quality guidelines
  • Prevent fraud
  • Simplicity and variety
  • Bundle commonly purchased items
  • Pair sales with events ← this reminds me of the growth practice of requesting feedback when engagement is high
  • Incentivize social sharing
  • Diversify ad networks
  • Keep loss aversion in mind
  • Always be testing “Never stop testing your monetization efforts, because your players’ perception of value (both real-world and in-game) will change over time“

Multiplayer

“… detailed documentation on multiplayer architecture at playfab.com/multiplayer”

Leaderboards

“As soon as you add a leaderboard in a game, even if it’s a single-player game, players start seeing progress against other people, and people all of a sudden start engaging more” Makes me think there are mechanics for games based on human behavior comparable to those used by growth features. For example, leaderboards increasing engagement highlights a human response to hierarchy.

Filtering makes leaderboards more fun:

  • Geo
  • Platform
  • Mode, eg player-vs-player
  • Option, eg difficulty
  • Level
  • Statistic, eg # wins
  • Time, eg stats for today

“combining the variables Platform, Level, and Statistic you could create a leaderboard for ‘Fastest time (Statistic) to complete Ventura Highway (Level) by PC players (Platform).’”
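
Combining filter variables like this maps naturally onto filtering and sorting score records. A minimal sketch, assuming each score is a dict with a `player` key plus whatever variables we filter on; the record shape and the `leaderboard` function are hypothetical.

```python
def leaderboard(scores, statistic, top=10, lower_is_better=False, **filters):
    """Rank score records by `statistic`, keeping only records matching `filters`.

    `scores` is a list of dicts, eg {"player": ..., "platform": ..., "level": ...}.
    """
    rows = [s for s in scores if all(s.get(k) == v for k, v in filters.items())]
    rows.sort(key=lambda s: s[statistic], reverse=not lower_is_better)
    return [(s["player"], s[statistic]) for s in rows[:top]]
```

For the guide's example, this would be called with `statistic="time"`, `lower_is_better=True`, `platform="pc"` and `level="Ventura Highway"`.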

Leaderboards can also encourage social behavior, eg biggest contributor to team

An ability to reset the leaderboard can encourage participation

Award prizes based on achievements shown in the leaderboard.

Groups

“Groups … can get players more invested in a game”

Some group dynamics:

  • Communication
  • Game progress
  • Stats 

I wonder if these can be used for other groups, eg a working group.

Interesting: “Determine how short-term groups are formed based on how much players need to trust teammates to succeed … “

“Long-term groups (such as guilds) have been proven to increase player retention …” Seems like a form of “investment” that makes an app stickier. The fact that it was “proven” makes me think there might be papers to read.

“… how do I provide you the best experience not only within your guild, but when your guild is gone… It comes down to matchmaking … the right aspiration together as a group.” Reminds me of work dynamics.

Managing communities

“A dedicated community manager can help keep players satisfied and foster a positive community …” Reminds me of the dev “advocate” role

Some ways to avoid bad behavior:

  • Limiting communication options
  • Filtering words and phrases
  • Defining a code of conduct

“The team behind Guild Wars 2 reportedly built the whole game around the idea that ‘players should always be happy to see one another.’” 🙂

“The more you can provide a framework for people to operate in, the more likely they are …“

Localization

“50% or more of online users will only buy when presented offers in their native language.”

Good idea: give the localization team access to edit strings

“Store as much of the in-game text on the server as possible, so it can be easily edited and localized”

Best practices

  • Consider multiplayer early in development
  • Add multiplayer elements whenever possible
  • Experiment with matchmaking algorithms
  • Plan for multiplayer scaling needs
  • Offer multiple ways to communicate
  • Enable customization of groups, to increase engagement
  • Reset leaderboards on a regular basis
  • Award prizes based on leaderboard stats
  • Enable users to “refresh” game to explicitly load new config
  • Localize communications(!) 

Tools and services

The guide lists PlayFab’s API, but I think it’s more interesting as an overview of useful entities and controls:

  • Auth
  • Content
    • Game content
    • User generated content
  • User data
  • Matchmaking
  • Leaderboards
    • Tournaments
    • Reset schedules
    • Prizes
    • Fraud prevention
  • Communication
    • P2p
    • Text and voice with transcription and translation
    • Accessibility (speech to text and vice versa)
  • Eng controls
    • Config
    • Reporting
    • Events
    • Automation
    • Scheduling
  • Product & community controls
    • Reporting
    • Event log
    • User management
    • Automation
    • Scheduling
    • Segmentation
    • Experimentation
    • Messaging
  • Economics controls
    • Stores
    • Sales
    • Economy
    • Fraud prevention

Software ecology

A documentary about an eco-friendly home near Austin inspired me to think about software systems from an ecological perspective.

The notion of a software or product “ecosystem” isn’t new, but I’d previously only thought about it as fostering healthy interactions in a system; I hadn’t considered the non-human actors. Is the code hard to maintain? Are alerts waking people up unnecessarily? Is the business sustainable? Is there a natural order? Is anything out of place, like an old tire in a stream? Can we achieve our goals in harmony with the natural order?

For example, I worked on a free service that would alert when resources were exhausted. Because it was free, it was natural for consumers to deprioritize efficient usage. Maintainers of the service absorbed the cost in the form of routine alerts. A more balanced system would shift some cost to the consumers.

I think the idea of separating concerns is another example. Decoupling can reduce maintenance cost even if the functionality doesn’t change.

A colleague once remarked that every syntax variation allowed by a language would eventually appear in a code base; a convention could not stop this. Perhaps this was another example of an imbalanced natural order. The cost of enforcement was solely on the reviewer. Shifting this cost to programmatic validation, like a linter, would help restore balance.
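
As a sketch of what shifting that cost to programmatic validation might look like, here is a tiny check built on Python's standard `ast` module. The convention it enforces (no `from module import *`) and the name `find_star_imports` are just examples I picked; in practice an existing linter would likely cover this, and the point is only that a machine, not a reviewer, does the enforcing.

```python
import ast

def find_star_imports(source):
    """Return the line numbers of `from module import *` statements in the source."""
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.ImportFrom)
        and any(alias.name == "*" for alias in node.names)
    ]

sample = "import os\nfrom math import *\nfrom json import loads\n"
print(find_star_imports(sample))  # [2]
```

Run in CI, a check like this fails the build before a reviewer ever sees the change.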

Binary prioritization

Prioritizing unplanned work is hard, especially when we’re overwhelmed.

In this context, I’ve found it helpful to split things into two groups: important or not. My team calls this “binary prioritization”. We work on the important stuff and forget everything else; anything misidentified as unimportant will bubble back up.

Brandon Chu’s essay “Ruthless Prioritization” provides a helpful distinction: work within a project vs work between projects. The former benefits from binary prioritization; the latter from a more rigorous process, like OKRs.

This also reminds me of Amazon’s “Bias for action” principle, and the Agile principles. For example, Agile embraces change (“Agile processes harness change for the customer’s competitive advantage”). Binary prioritization enables us to embrace a high volume of change.

Bias for action

This is straight from Amazon’s leadership principles. It also seems related to binary/ruthless prioritization, the Agile principle of “Simplicity — the art of maximizing the amount of work not done”, and customer focus.

An ambiguous problem may have many possible solutions. We risk slowing to a halt investigating all of them. It’s helpful to regularly question if a given task is important. “Analysis paralysis” might describe this, but I prefer Amazon’s phrasing: “Bias for action”.

As a concrete example, there may be many ways to solve an interview problem, but time is limited. Solving the problem by any means is preferable to identifying all possible solutions without implementing any.

A pattern for reducing ambiguity

Here’s the pattern:

  1. Identify the owner
  2. Have the owner describe the problem and solution in a doc
  3. Invite stakeholders to comment on the doc
  4. If discussions on the doc linger, schedule a meeting to finalize
  5. Document conclusions in the doc, to facilitate communication and closure, and for future reference

This may seem obvious, but I often forget it, especially in cases where a task starts small and grows in complexity. A problem may seem too small to formally “own”. A solution may seem too trivial to document. Stakeholders may attend a meeting without context. A meeting may conclude with stakeholders feeling like they’ve expressed themselves, but there’s no actionable plan to resolve the problem.

Identifying one person from a group of stakeholders to own the project and be responsible for leading work to completion reduces organizational ambiguity.

Documenting the problem and proposed solution in writing reduces ambiguity by capturing ideas from a variety of media in a single, relatively objective place that stakeholders can comment on.

Documentation alone may achieve closure, but it may also spawn extensive commentary. Meetings are relatively expensive, but scheduling a meeting to drive closure reduces ambiguity by distilling commentary on the document into conclusions.

Documenting conclusions reduces ambiguity by rendering them in an objective form all stakeholders can agree on.

A few symptoms that indicate when this pattern might be useful:

  1. Endless back-and-forth in chat, bug comments, etc., which can give the impression of progress, but never resolves the issue
  2. Multiple and/or cross-functional stakeholders, which can obscure priorities
  3. Multiple people opining on a solution and/or answering questions, which can obscure ownership
  4. A problem that drags on, which can indicate it’s important, but inappropriately owned

This ties into larger discussions around project planning (e.g., managing planned vs unplanned work), and meeting efficiency (e.g., inviting stakeholders, assigning pre-work and clarifying outcomes), but the point here is just to succinctly identify an organizational pattern and when it can be helpful.