Peacemaker

The API deployment is built with a Dockerfile and deployed using Nomad. Jenkins mainly orchestrates our deployments using the APIs provided by the HashiCorp stack.

Current deployment strategy

Any merges into staging should trigger a job on jenkins.wealthfit.com that deploys the new changes. Similarly, any merges into master trigger the same job on production-jenkins.wealthfit.com.

What is Truthseeker

In the root of the peacemaker repository live two Nomad template files, one for each environment. These templates are used with consul-template to authenticate and pull K/V pairs and secrets from Consul and Vault. To help with this process, there is a utility library called Truthseeker that does three things:

  1. Builds Terraform templates & Nomad jobs via consul-template, securely grabbing keys and secrets from Vault & Consul.
  2. Provides helper methods to update Consul's K/V store & run Nomad jobs.
  3. Provides TruthKeeper, a GenServer used as a data store to help determine the next available load balancer and UI ports to run Fabio with.

What does this mean?

1a) Previously we had a multi-branch deployment feature. Any PR that had development as the base branch had AWS resources configured to create a separate environment with the PR's code deployed. The resources were configured using Terraform. The modules can be found in the Pathfinder repo, though this feature is currently disabled.

  1. Register the PR app under fabio so that we can set a target group to be used by the ALB in AWS. (also deprecated)

  2. In order to achieve rolling releases for minimal downtime during deployments, we generate random strings in Consul's K/V store that will be set as the sname for each of the Elixir nodes. This ensures that different API versions don't cluster together since we use canary deployments as our strategy (eg: Peacemaker v1 only clusters with other Peacemaker v1 nodes).

  3. Fetching secrets from Vault. The Nomad template contains only the path to where the secret lives in Vault. Truthseeker authenticates against Vault and fetches the specific secrets using consul-template under the hood.
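
The random sname idea from item 2 can be sketched roughly as follows. This is only an illustration in shell, not how Truthseeker does it (Truthseeker is an Elixir library): the function name random_sname, the peacemaker prefix, and the Consul key in the usage comment are all assumptions.

```shell
#!/bin/sh
# Sketch only: build a per-release sname so that nodes from different
# releases never share a name and therefore never cluster together.
# The "peacemaker" prefix is an assumption made for this example.

random_sname() {
  prefix="${1:-peacemaker}"
  # 8 random bytes rendered as 16 lowercase hex characters
  suffix="$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')"
  printf '%s_%s\n' "$prefix" "$suffix"
}

# Usage (hypothetical key path, needs a reachable Consul agent):
#   consul kv put peacemaker/sname "$(random_sname)"
```

Every node started from the same release would read the same value back from the K/V store, while the next release gets a fresh one.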

Troubleshooting

  • AppSignal

  • Nomad logs can be found using the nomad binary, e.g.:

      nomad job status peacemaker
          ID            = peacemaker
          Name          = peacemaker
          Submit Date   = 2022-02-10T12:43:41-08:00
          Type          = service
          Priority      = 50
          Datacenters   = us-west-2a,us-west-2b,us-west-2c
          Status        = running
          Periodic      = false
          Parameterized = false
    
          Summary
          Task Group      Queued  Starting  Running  Failed  Complete  Lost
          peacemaker-api  0       0         3        0       18        0
          Latest Deployment
          ID          = 2c1e9d49
          Status      = successful
          Description = Deployment completed successfully
          Deployed
          Task Group      Auto Revert  Promoted  Desired  Canaries  Placed  Healthy  Unhealthy  Progress Deadline
          peacemaker-api  true         true      3        3         3       3        0          2022-02-10T20:54:32Z
          Allocations
          ID        Node ID   Task Group      Version  Desired  Status   Created   Modified
          1fc09ca7  298c4ff9  peacemaker-api  336      run      running  7d2h ago  7d2h ago
          9d0fa8e1  03b9097f  peacemaker-api  336      run      running  7d2h ago  38m28s ago
          c527dadb  46b2652b  peacemaker-api  336      run      running  7d2h ago  7d2h ago
      nomad alloc logs <allocation-id>
          ex: nomad alloc logs 9d0fa8e1
              Server: peacemaker.wealthfit.com:80 (http)
              Request: PUT /api/account/password
              ** (exit) an exception was raised:
                  ** (RuntimeError) cannot encode association :allowed_courses from Peacemaker.Account to JSON because the association was not loaded.
              You can either preload the association:
                  Repo.preload(Peacemaker.Account, :allowed_courses)
              Or choose to not encode the association when converting the struct to JSON by explicitly listing the JSON fields in your schema:
                  defmodule Peacemaker.Account do
                  # ...
                  @derive {Jason.Encoder, only: [:name, :title, ...]}
                  schema ... do
                      (ecto 3.5.8) lib/ecto/json.ex:4: Jason.Encoder.Ecto.Association.NotLoaded.encode/2
                      (peacemaker 3.3.5-rc.5) lib/peacemaker/accounts/account.ex:65: Jason.Encoder.Peacemaker.Account.encode/2
                      (jason 1.2.2) lib/encode.ex:172: Jason.Encode.map_naive/3
                      (jason 1.2.2) lib/encode.ex:35: Jason.Encode.encode/2
                      (jason 1.2.2) lib/jason.ex:197: Jason.encode_to_iodata!/2
                      (phoenix 1.5.9) lib/phoenix/controller.ex:776: Phoenix.Controller.render_and_send/4
                      (peacemaker 3.3.5-rc.5) lib/peacemaker_web/controllers/api/account_controller.ex:1: PeacemakerWeb.AccountController.action/2
                      (peacemaker 3.3.5-rc.5) lib/peacemaker_web/controllers/api/account_controller.ex:1: PeacemakerWeb.AccountController.phoenix_controller_pipeline/2
                      (phoenix 1.5.9) lib/phoenix/router.ex:352: Phoenix.Router.__call__/2
                      (peacemaker 3.3.5-rc.5) lib/peacemaker_web/endpoint.ex:1: PeacemakerWeb.Endpoint.plug_builder_call/2
                      (peacemaker 3.3.5-rc.5) lib/peacemaker_web/endpoint.ex:3: anonymous fn/3 in PeacemakerWeb.Endpoint."call (overridable 3)"/2
                      (appsignal 2.1.9) lib/appsignal/instrumentation.ex:10: Appsignal.Instrumentation.instrument/1
                      (peacemaker 3.3.5-rc.5) lib/peacemaker_web/endpoint.ex:1: PeacemakerWeb.Endpoint."call (overridable 4)"/2
                      (peacemaker 3.3.5-rc.5) lib/plug/error_handler.ex:65: PeacemakerWeb.Endpoint.call/2
                      (phoenix 1.5.9) lib/phoenix/endpoint/cowboy2_handler.ex:65: Phoenix.Endpoint.Cowboy2Handler.init/4
                      (cowboy 2.9.0) /opt/app/deps/cowboy/src/cowboy_handler.erl:37: :cowboy_handler.execute/2
                      (cowboy 2.9.0) /opt/app/deps/cowboy/src/cowboy_stream_h.erl:306: :cowboy_stream_h.execute/3
                      (cowboy 2.9.0) /opt/app/deps/cowboy/src/cowboy_stream_h.erl:295: :cowboy_stream_h.request_process/3
                      (stdlib 3.15.2) proc_lib.erl:226: :proc_lib.init_p_do_apply/3
  • Nomad UI is also available via tunnel-[staging,production]-nomad-[a-c]. See Configure SSH below for the under-the-hood method.

Errors

Tips & Tricks

The following script pings the /_internal/version endpoint of the Peacemaker API every second. This is helpful during deployments to confirm that the new version has been deployed. You should be able to watch the version gradually roll over:

while true; do curl -k https://peacemaker.wealthfit.com/_internal/version; sleep 1; done
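
A variation on the loop above, as a rough sketch: it prints a line only when the reported version changes, which makes the rollover easier to spot. The function names watch_version and fetch_peacemaker_version are made up for this example, and the optional poll limit exists only so the loop can be stopped without Ctrl-C.

```shell
#!/bin/sh
# watch_version FETCH_CMD [INTERVAL_SECONDS] [MAX_POLLS]
# Calls FETCH_CMD repeatedly and prints a timestamped line whenever its
# output differs from the previous poll. MAX_POLLS=0 means "poll forever".
watch_version() {
  fetch_cmd="$1"
  interval="${2:-1}"
  max_polls="${3:-0}"
  last=""
  polls=0
  while :; do
    # A failed request is reported as "unreachable" instead of aborting
    current="$($fetch_cmd 2>/dev/null)" || current="unreachable"
    if [ "$current" != "$last" ]; then
      printf '%s %s\n' "$(date +%H:%M:%S)" "$current"
      last="$current"
    fi
    polls=$((polls + 1))
    if [ "$max_polls" -gt 0 ] && [ "$polls" -ge "$max_polls" ]; then
      break
    fi
    sleep "$interval"
  done
}

# Same endpoint as the one-liner above; -s silences curl's progress output.
fetch_peacemaker_version() {
  curl -ks https://peacemaker.wealthfit.com/_internal/version
}

# Usage:
#   watch_version fetch_peacemaker_version 1
```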

Adding Secrets:

There is a tool in the design-systems repo, under the wf npm run script, that can be used to add secrets to Vault. Otherwise, you can manually open an SSH tunnel (via tunnel-[staging,production]-vault-[a-c]) to any of the Vault nodes on port 8200. (Note: SSL isn't set up around Vault due to time constraints, so make sure to access Vault over http.)

The secret format looks something like the following:

{{with secret "secret/mux"}}{{.Data.access_token_id}}{{end}}

which follows this general pattern:

{{with secret "[path_to_]/[vault_secret]"}}{{.[key].[value]}}{{end}}
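
To make the pattern concrete, here is a small sketch that builds the expression from a path and a key. The helper name secret_expr is hypothetical; it only formats text and does not talk to Vault.

```shell
#!/bin/sh
# secret_expr VAULT_PATH KEY
# Prints a consul-template expression that reads KEY from the secret
# stored at VAULT_PATH, in the same shape as the secret/mux example.
secret_expr() {
  printf '{{with secret "%s"}}{{.Data.%s}}{{end}}\n' "$1" "$2"
}

secret_expr secret/mux access_token_id
```

The printed line can be pasted into a Nomad template wherever the secret's value is needed.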

WARNING: UNPROTECTED PRIVATE KEY FILE!

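SSH prints this warning and refuses to use the key when the key file's permissions are too open.

Fix: chmod 400 ~/.ssh/private_key_file_here.pem (Stack Overflow reference)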

Manual Deployments (was written in 2018 but still technically how things work under the hood). Some items may be outdated.

Prerequisites

  • Acquire private SSH keys from an admin.
  • Install nomad
  • Install docker

Configure SSH

Copy the following into `~/.ssh/config`

```
Host pathfinder-staging-bastion
  HostName ec2-18-205-194-8.compute-1.amazonaws.com
  User ec2-user
  Port 22
  IdentityFile ~/.ssh/wealthfit-staging-pathfinder.pem
  ForwardAgent yes
  GSSAPIAuthentication no
  PasswordAuthentication no
  ChallengeResponseAuthentication no
  StrictHostKeyChecking no
  UserKnownHostsFile=/dev/null
  GatewayPorts yes

Host peacemaker-staging-a
  ForwardAgent yes
  UserKnownHostsFile=/dev/null
  GatewayPorts yes
  User ubuntu
  Port 22
  ProxyCommand ssh pathfinder-staging-bastion nc 10.1.1.233 22
  IdentityFile ~/.ssh/wealthfit-staging-pathfinder.pem
  StrictHostKeyChecking no

Host peacemaker-staging-b
  ForwardAgent yes
  User ubuntu
  Port 22
  ProxyCommand ssh pathfinder-staging-bastion nc 10.1.2.11 22
  IdentityFile ~/.ssh/wealthfit-staging-pathfinder.pem
  StrictHostKeyChecking no

Host pathfinder-staging-consul-a
  ForwardAgent yes
  UserKnownHostsFile=/dev/null
  GatewayPorts yes
  User ubuntu
  Port 22
  ProxyCommand ssh pathfinder-staging-bastion nc 10.1.1.126 22
  IdentityFile ~/.ssh/wealthfit-staging-pathfinder.pem
  StrictHostKeyChecking no

Host pathfinder-staging-consul-b
  ForwardAgent yes
  UserKnownHostsFile=/dev/null
  GatewayPorts yes
  User ubuntu
  Port 22
  ProxyCommand ssh pathfinder-staging-bastion nc 10.1.2.204 22
  IdentityFile ~/.ssh/wealthfit-staging-pathfinder.pem
  StrictHostKeyChecking no

Host pathfinder-staging-nomad-a
  UserKnownHostsFile=/dev/null
  GatewayPorts yes
  ForwardAgent yes
  User ubuntu
  Port 22
  ProxyCommand ssh pathfinder-staging-bastion nc 10.1.1.143 22
  IdentityFile ~/.ssh/wealthfit-staging-pathfinder.pem
  StrictHostKeyChecking no

Host pathfinder-staging-nomad-b
  ForwardAgent yes
  UserKnownHostsFile=/dev/null
  GatewayPorts yes
  User ubuntu
  Port 22
  ProxyCommand ssh pathfinder-staging-bastion nc 10.1.2.129 22
  IdentityFile ~/.ssh/wealthfit-staging-pathfinder.pem
  StrictHostKeyChecking no
```

Caveats

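Note: You may change the Host value (the alias on each Host line) to whatever you want. This will be the name used when you ssh into the container (e.g. ssh peacemaker-staging-a). If you rename pathfinder-staging-bastion, update the ProxyCommand lines to match.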

It's important that the private IP addresses are correct. During development, containers may be destroyed/recreated with new private IP addresses.
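
Since the private IPs can change, it may be handy to regenerate a stanza instead of editing it by hand. The sketch below prints a Host block in the same shape as the peacemaker-staging-b entry; the helper name ssh_host_stanza is made up for this example, and the bastion alias and key path are copied from the config above.

```shell
#!/bin/sh
# ssh_host_stanza ALIAS PRIVATE_IP
# Prints an ssh_config Host block that reaches PRIVATE_IP through the
# staging bastion. Nothing is written to ~/.ssh/config automatically.
ssh_host_stanza() {
  cat <<EOF
Host $1
  ForwardAgent yes
  User ubuntu
  Port 22
  ProxyCommand ssh pathfinder-staging-bastion nc $2 22
  IdentityFile ~/.ssh/wealthfit-staging-pathfinder.pem
  StrictHostKeyChecking no
EOF
}

ssh_host_stanza peacemaker-staging-a 10.1.1.233
```

Review the output before pasting it over the old stanza in `~/.ssh/config`.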

The command below opens a tunnel that forwards requests made to localhost:4646 on your machine to port 4646 on pathfinder-staging-nomad-a, so when you open localhost:4646 in the browser, you should be able to view the Nomad Web UI.

Run this command in your terminal: ssh -L 4646:localhost:4646 pathfinder-staging-nomad-a -N

Running API Job

  • cd to the root of the project
  • docker build -t wealthfit/peacemaker:0.0.0 .
    • This will create a docker image named wealthfit/peacemaker with the tag 0.0.0. Versioning processes will be addressed in the near future. For the time being, avoid bumping this version tag. You can still create other tags (e.g. 0.0.0-test, test, foo).
  • Validate that the docker image runs successfully: docker run wealthfit/peacemaker:0.0.0
    • secrets.prod.exs expects a DATABASE_URL environment variable. The default DATABASE_URL in the Dockerfile points to our production RDS. This means that if we run our docker image locally without overriding DATABASE_URL, the image will fail when running migrations.
  • Update api.nomad to use the new docker image tag.
  • Validate that the output of nomad job plan api.nomad is what you expect.
  • nomad job run api.nomad
  • If you view the Nomad Web UI, you should be able to see the job running. Any logs / debugging can be done through the web UI upon failure.
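
The build, plan, and run steps above could be strung together roughly like this. It is a sketch under a few assumptions: the names run and deploy_api are made up, api.nomad is assumed to already reference the new tag, and the local docker run check is left out because it needs a valid DATABASE_URL. Setting DRY_RUN=1 prints the commands instead of executing them.

```shell
#!/bin/sh
# run CMD [ARGS...]: execute a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# deploy_api [TAG]: build the image, review the plan, then submit the job.
deploy_api() {
  tag="${1:-0.0.0}"
  image="wealthfit/peacemaker:$tag"

  run docker build -t "$image" . || return 1

  # This sketch does not edit api.nomad; update the image tag there first.
  plan_status=0
  run nomad job plan api.nomad || plan_status=$?
  # nomad job plan exits 0 (no changes), 1 (changes planned) or 255 (error)
  if [ "$plan_status" -eq 255 ]; then
    echo "nomad job plan failed" >&2
    return 1
  fi

  run nomad job run api.nomad
}

# Usage:
#   DRY_RUN=1; deploy_api 0.0.0-test   # print the commands only
#   deploy_api 0.0.0-test              # actually build and submit
```

In a real run you would still want to read the plan output yourself before the job is submitted, so treat this as a starting point rather than a replacement for the manual check.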

Future Notes

I am hoping that this document will eventually be deprecated once the CI/CD pipeline is complete. The ideal workflow is to have developers just push code and let the infrastructure handle the rest. This process is a bit convoluted right now, and I am planning to simplify it with automation pipelines. I don't foresee this process scaling well because of potential security risks.