Environments

  • Production - domain is seamlessaccess.org
  • Beta & staging - domain is thiss.io

SeamlessAccess Services

...

Use the 'Description & Troubleshooting' section above to troubleshoot the alarms. See also GeneralTroubleshooting.

Upgrade & Verification

We run docker.sunet.se/pyff as a docker image. ci.sunet.se (Jenkins) builds that image every day with newer versions of its dependencies and pushes it to docker.sunet.se.

In production we should only run an image with a manually applied tag that has been tested beforehand, either in the SeamlessAccess beta environment or in another test environment in SUNET. We should not run an image with a tag that ci.sunet.se adds automatically. Running a tested, manually tagged image avoids the risk of picking up dependencies that are not compatible with the underlying pyFF code. Currently we are running a 'stable' tag that has been tested in the EIDAS project.

  • Both PyFF and sunet-md_publisher are upgraded by changing the versions in thiss-ops/global/overlay/etc/puppet/cosmos-rules.yaml. The puppet manifests for production, beta and staging are separate.


Code Block
   thiss::pyff_prod:
      pyff_version: eidas-2.1.3-stable
      output: /var/www/html/metadata.json
      output_trust: /var/www/html/metadata_sp.json
   thiss::md_publisher_prod:
      watch: /var/www/html/metadata.json
      watch_sp: /var/www/html/m


  • After committing and bump-tagging the changes, run cosmos on the affected servers, preferably one at a time, and check that the service is working after each run.
  • If PyFF is upgraded, run the aforementioned cronjob for PyFF and check that it does not report any errors.
  • If you have upgraded the metadata publishing service, restart sunet-md_publisher. See GeneralTroubleshooting.
  • Check https://monitor.seamlessaccess.org/nagios3/ for any alarms.
  • The thiss-mdq servers named md-*.seamlessaccess.org should be able to fetch the metadata from the Aggregator & Publisher servers. Make sure everything is 'green' for those servers too.
  • You can log in to the thiss-mdq servers and run /usr/local/bin/get_metadata.sh to confirm that they can fetch the metadata files without any issues.
  • As a final check, visit any SP, for example wiki.sunet.se, and confirm that it is possible to log in using the SA discovery service. Alternatively, check the login using https://demo.seamlessaccess.org/ for production upgrades and https://demo.beta.seamlessaccess.org for Beta upgrades.
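
As a rough aid for the verification steps above, the following shell sketch tests whether a metadata output file exists, is non-empty and was written recently. The function name metadata_file_fresh and the 60-minute default are assumptions made for illustration; the file paths in the usage comment come from the cosmos-rules.yaml example above.

```shell
# Succeed when the file exists, is non-empty and was modified within the
# last N minutes. N defaults to 60, an arbitrary value chosen for illustration.
metadata_file_fresh() {
    mf_file=$1
    mf_max_age_min=${2:-60}
    # -s is true only for an existing file with a size greater than zero.
    [ -s "$mf_file" ] || return 1
    # find prints the path only when the modification time is recent enough.
    [ -n "$(find "$mf_file" -mmin "-$mf_max_age_min" 2>/dev/null)" ]
}

# Example, to be run on an Aggregator & Publisher server after the PyFF cronjob:
# metadata_file_fresh /var/www/html/metadata.json 60 || echo "metadata.json is stale or empty"
# metadata_file_fresh /var/www/html/metadata_sp.json 60 || echo "metadata_sp.json is stale or empty"
```

This only confirms that the files were regenerated. It does not replace the Nagios checks or the login test described above.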

thiss-mdq

Servers

Name | Location | Env
md[1-2].aws1.geant.eu.seamlessaccess.org | Frankfurt, AWS | Production
md[1-2].aws2.geant.eu.seamlessaccess.org | N. California, AWS | Production
md[1-2].ntx.sunet.eu.seamlessaccess.org | Nutanix, SUNET | Production
md[1-2].se-east.sunet.eu.seamlessaccess.org | STO1v2, Safespring | Production
md[1-2].thiss.io | STO1v2, Safespring | Beta
md-staging-2.thiss.io | STO1v2, Safespring | Staging

...

Code Block
➜  ~ curl -k https://static.se-east.sunet.eu.seamlessaccess.org/manifest.json
{
  "short_name": "Seamless Access",
  "name": "Seamless Access Identity Selector",
  "description": "See https://seamlessaccess.org",
  "version": "2.1.98"
}
Code Block
root@static-1: ~ # curl -k -4 http://localhost/manifest.json
{
  "short_name": "Seamless Access",
  "name": "Seamless Access Identity Selector",
  "description": "See https://seamlessaccess.org",
  "version": "2.1.160"
}

We also have nagios checks on the accessibility of these web links on each level. Check also GeneralTroubleshooting.
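
The manifest.json lookups shown above can be compared mechanically. The sketch below extracts the "version" field from a manifest.json body and checks whether two bodies report the same version. The function names are hypothetical, and the sed expression assumes the field is formatted as in the output above.

```shell
# Print the "version" field of a manifest.json document read from stdin.
# Uses sed only, so it relies on the "version": "x.y.z" pair sitting on one line.
manifest_version() {
    sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Compare the versions found in two manifest.json bodies.
# Exit status 0 means both carry the same, non-empty version.
same_manifest_version() {
    v_first=$(printf '%s\n' "$1" | manifest_version)
    v_second=$(printf '%s\n' "$2" | manifest_version)
    [ -n "$v_first" ] && [ "$v_first" = "$v_second" ]
}

# Example (needs network access, so it is left commented out):
# site=$(curl -ks https://static.se-east.sunet.eu.seamlessaccess.org/manifest.json)
# backend=$(curl -ks -4 http://localhost/manifest.json)
# same_manifest_version "$site" "$backend" || echo "version mismatch"
```

A mismatch between a site link and a backend can simply mean that an upgrade is still being rolled out, so treat the result as a hint rather than an alarm.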

Upgrade & verification 

The process, including verification for both production and beta environments, is described in the links below.

Seamless Access Software Deployment Guide#Frontend(service.seamlessaccess.org)

Seamless Access Software Deployment Guide#Frontend(use.thiss.io)

HAproxy Load Balancer for thiss-mdq

Servers

Name | Location | Env
md.aws1.geant.eu.seamlessaccess.org | Frankfurt, AWS | Production
md.aws2.geant.eu.seamlessaccess.org | N. California, AWS | Production
md.ntx.sunet.eu.seamlessaccess.org | Nutanix, SUNET | Production
md.se-east.sunet.eu.seamlessaccess.org | STO1v2, Safespring | Production
md-lb.thiss.io | STO1v2, Safespring | Beta

Description

There is one load balancer server running HAproxy in front of the two thiss-mdq servers at each site. These servers have the names md.*.seamlessaccess.org. The HAproxy servers are added in Fastly for the service md.seamlessaccess.org. Fastly forwards non-cached HTTPS GET requests from users to one of these HAproxy servers, which in turn forwards them to one of the thiss-mdq servers using a round-robin algorithm. These HTTPS requests carry metadata queries.

The HAproxy service runs in a docker container and the configuration of it is supplied by puppet manifests.

Monitoring & Troubleshooting

We have the following checks for these load balancers for each site in https://monitor.seamlessaccess.org/nagios4/

  1. Monitor the date when the metadata JSON files are last modified from the https://<site link>/manifest.json
  2. SSL check and availability of the site links
  3. The string 'OK' is found in https://<site link>/status
  4. Monitor that both backends are up by checking HAproxy stats from http://<site link>:8404/stats. This link is accessible only via the SUNET VPN (for SUNET NOC members) and from the monitor server.

The site links are

https://md.ntx.sunet.eu.seamlessaccess.org

https://md.se-east.sunet.eu.seamlessaccess.org

https://md.aws2.geant.eu.seamlessaccess.org

https://md.aws1.geant.eu.seamlessaccess.org

Use GeneralTroubleshooting to fix alarms. If the thiss-mdq servers are unavailable, the HAproxy servers will raise alarms as well; in that case, see the section for thiss-mdq servers to troubleshoot them.
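
When an alarm fires, the 'OK' check on /status can be repeated by hand. The sketch below loops over the site links listed above. The function names and the 10-second timeout are assumptions for illustration and are not taken from the Nagios configuration.

```shell
# Site links taken from the list above.
MDQ_SITES="md.ntx.sunet.eu.seamlessaccess.org md.se-east.sunet.eu.seamlessaccess.org md.aws2.geant.eu.seamlessaccess.org md.aws1.geant.eu.seamlessaccess.org"

# Succeed when the response body read from stdin contains the string OK.
status_body_ok() {
    grep -q 'OK'
}

# Query /status on every site link and print one result line per site.
# curl -f turns HTTP errors into a non-zero exit; --max-time bounds each request.
check_mdq_status() {
    for site in $MDQ_SITES; do
        if curl -fsS --max-time 10 "https://$site/status" | status_body_ok; then
            echo "$site: OK"
        else
            echo "$site: FAILED"
        fi
    done
}

# Example (needs network access): check_mdq_status
```

A FAILED line says only that the string was not found or the request failed; the HAproxy stats page and the thiss-mdq section are still needed to find the cause.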

Upgrade

SeamlessAccess HAproxy Upgrade

HAproxy Load Balancer for thiss-js

Servers

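Name | Location | Env
static.aws1.geant.eu.seamlessaccess.org | Frankfurt, AWS | Production
static.aws2.geant.eu.seamlessaccess.org | N. California, AWS | Production
static.ntx.sunet.eu.seamlessaccess.org | Nutanix, SUNET | Production
static.se-east.sunet.eu.seamlessaccess.org | STO1v2, Safespring | Production
static.thiss.io | STO1v2, Safespring | Beta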

Description & Troubleshooting

There is one load balancer server running HAproxy in front of the two thiss-js servers at each site. These servers have the names static.*.seamlessaccess.org. The HAproxy servers are added in Fastly for the service service.seamlessaccess.org. Fastly forwards non-cached HTTPS GET requests made by users to https://service.seamlessaccess.org to one of these HAproxy servers, which in turn forwards them to one of the servers running the thiss-js code using a round-robin algorithm.

The HAproxy service runs in a docker container and the configuration of it is supplied by puppet manifests.

Monitoring & Troubleshooting

We have the following checks for these load balancers for each site in https://monitor.seamlessaccess.org/

  1. SSL check and availability of the site links
  2. Monitor that both backends are up by checking HAproxy stats from http://<site link>:8404/stats. This link is accessible only via the SUNET VPN (for SUNET NOC members) and from the monitor server.

The site links are

https://static.ntx.sunet.eu.seamlessaccess.org

https://static.se-east.sunet.eu.seamlessaccess.org

https://static.aws1.geant.eu.seamlessaccess.org

https://static.aws2.geant.eu.seamlessaccess.org

Use GeneralTroubleshooting to fix alarms. If the thiss-js servers are unavailable, the HAproxy servers will raise alarms as well; in that case, see the section for thiss-js servers to troubleshoot them.
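
The backend check on the stats page can also be done from a shell. HAproxy can serve its statistics as CSV when ";csv" is appended to the stats URI; the sketch below assumes that this works for the /stats endpoint mentioned above and that the caller is on the SUNET VPN or the monitor server. The function name is hypothetical.

```shell
# Read HAproxy stats in CSV form from stdin and print every server row whose
# status does not start with "UP". The "status" column is located through the
# header line, and the FRONTEND/BACKEND summary rows are skipped.
# Note: servers without health checks report "no check" and are listed too.
haproxy_servers_not_up() {
    awk -F',' '
        NR == 1 {
            # The header line starts with "# pxname,svname,...".
            sub(/^# /, "", $1)
            for (i = 1; i <= NF; i++) if ($i == "status") col = i
            next
        }
        col && NF >= col && $2 != "FRONTEND" && $2 != "BACKEND" && $col !~ /^UP/ {
            print $1 "/" $2 ": " $col
        }
    '
}

# Example (replace <site link> with one of the site links above):
# curl -fsS "http://<site link>:8404/stats;csv" | haproxy_servers_not_up
```

No output means every listed server reported a status beginning with UP at the time of the request.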

Upgrade

SeamlessAccess HAproxy Upgrade

Monitor

Server

Name | Location | Env
monitor.ntx.sunet.eu.seamlessaccess.org | STO1v2, Safespring | Production

Description

This is a monitor server which runs Nagios4 to monitor the health and operations of the virtual servers in Production, Beta and Staging. The GUI is here https://monitor.seamlessaccess.org/.

Information regarding access is given here https://wiki.sunet.se/display/sunetops/Monitoring.

Monitoring & Troubleshooting

  • Run service nagios4 status to check the status
  • Run service nagios4 restart to restart the application.
  • Check logs in /var/log/nagios4/nagios.log
  • Many of the configurations from /etc/nagios4/ are controlled by puppet manifests.

Upgrade

No proper guide is available. Nagios is usually upgraded when a newer version is available at the time we upgrade the OS of the server.

Demo Application

Server

Name | Location | Env
sp-test.seamlessaccess.org | STO1v2, Safespring | Mixed

Description

This server runs the Demo SP (service provider) applications for both Production and Beta. They are exposed at https://demo.seamlessaccess.org/ and https://demo.beta.seamlessaccess.org respectively.

The application runs in a docker container.

Monitoring & Troubleshooting

No specific monitoring is done for this service.

GeneralTroubleshooting can be used for troubleshooting.

Upgrade

Upgrade by setting the version parameter in thiss-ops/global/overlay/etc/puppet/cosmos-rules.yaml or in thiss-ops/global/overlay/etc/puppet/modules/thiss/manifests/demo_sp.pp.

Log


Server


Name | Location | Env
log

Description

The server runs a syslog application to collect logs from service.seamlessaccess.org.

...

The server is explicitly allowed in the Fastly configuration. You can verify this under Logging in the current version of the service.seamlessaccess.org configuration running in Fastly.

We have added Enrique Perez's SSH key and IP address so that he can fetch the log files named sa.log from /var/log.

This is how it looks in /root/.ssh/authorized_keys of the server.

Code Block
command="/usr/bin/rrsync -ro /var/log/",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding ssh-ed25519 AAAAB3NzaC1yc2EAAAADAQABAAABAQDWOTGSoPh/+uNglvrLifb4jVhDLzGnAQlH3jagVnWFQKVieUNB2vlhrTtW/89+9uRUtjICa1gevGxICkavgaP8MIvOrgksgR+j+CakbwKe1gGmC5AqFb1kmbUOpeUrGDHYbWp46fOc0zTBxTqT1u93LAw/ZUHUMB3ETnmScrbvxC3JwA0qsU7bw73QCLM24epy8dvstFTLcNPcPC2TOCh86IkZpvJj38Hy5uqanWN6KceOtQBtOORJE6rAsBTpmhiVtE/AsvkEWKNk1g5uArULK/Dd6K7fMxkr0rv+YT9qot/z0xUqHe5RDn3E5w3ojV8x47/0V9l3eh9jrEf3l6u9 -var-log--command_key

logrotate is configured to rotate the sa.log files and keep them for 30 days, after which they are removed.

Monitoring & Troubleshooting

  • Check /var/log/syslog if there is any issue with access for Enrique or with rsyslog functionality.
  • Consult the applicable puppet manifests to understand the configuration and troubleshoot further.
  • Check in Fastly whether there is any warning message in the Logging part of the service configuration for service.seamlessaccess.org.

...


General Troubleshooting

Almost all services run in docker containers. They are added as systemd units. The names start with sunet-*.
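
Because the services are systemd units with a common prefix, a quick overview can be obtained with systemctl. The sketch below lists the sunet-* units and filters out those that are not active. The function names are hypothetical, and the filter assumes the usual UNIT LOAD ACTIVE SUB column order of systemctl list-units.

```shell
# List all sunet-* systemd units known to the host, one per line,
# without the header, footer or status bullets.
list_sunet_units() {
    systemctl list-units --all --plain --no-legend 'sunet-*'
}

# From list-units output on stdin, print the names of units whose ACTIVE
# column (third field) is anything other than "active".
sunet_units_not_active() {
    awk '$3 != "active" { print $1 }'
}

# Example, to be run on one of the servers:
# list_sunet_units | sunet_units_not_active
# journalctl -u <unit name> --since "1 hour ago"   # inspect the logs of a unit
```

Units that are stopped on purpose also show up in the filtered list, so compare the result with what is expected to run on that server type.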

...

Server type | Rules
All | SSH via SUNET's designated jump hosts
All | NRPE to monitor.seamlessaccess.org & nagiosxi.nordu.net
All | Egress/outgoing packets from all ports
HAproxy Load Balancer for thiss-js | HTTPS to internet; TCP 8404 (HAproxy stats port) to vpn1.sunet.se & monitor.seamlessaccess.org
HAproxy Load Balancer for thiss-mdq | HTTPS to internet; TCP 8404 (HAproxy stats port) to vpn1.sunet.se & monitor.seamlessaccess.org
thiss-js | HTTP to HAproxy Load Balancer for thiss-js in the same site & monitor.seamlessaccess.org
thiss-mdq for Production & Beta | HTTP to HAproxy Load Balancer for thiss-mdq in the same site & monitor.seamlessaccess.org
thiss-mdq for staging | HTTPS and HTTP to SUNET Load Balancers
Aggregator & publishers for Production & Staging | HTTPS to thiss-mdq servers in the same site & monitor.seamlessaccess.org
Aggregator & publishers for Beta | HTTPS to thiss-mdq servers in the same site, monitor.seamlessaccess.org & sp-test.seamlessaccess.org
Monitor | HTTPS to vpn1.sunet.se; HTTP to internet (for ACME challenges to renew Let's Encrypt certificate)
Demo Application | HTTPS to internet
Log | SSH access to Enrique Perez Arnaud & TCP 514 (syslog) to internet


Staging Metadata Service

...