few thoughts about upgrade role #779

Open
glushakov opened this issue Oct 7, 2024 · 2 comments
@glushakov

Hi.
Thanks for your work.
I based a major PostgreSQL upgrade on your role and would like to share some of the cases I encountered.

  1. Patroni precheck https://github.com/vitabaks/postgresql_cluster/blob/master/automation/pg_upgrade.yml#L19
    doesn't take the scheme (http/https) into account when accessing the Patroni API.

  2. Tablespaces.

  • When there is a custom tablespace, the schema precheck creates a new TS directory. The pg_upgrade precheck that runs afterwards then fails with an error that the TS directory already exists. My solution was not to dump the TS creation at all, using the --no-tablespaces option. Otherwise we would need to clean up the TS directories after the schema precheck, but I don't like deleting data on production :)
  • I added an additional check that the TS is not inside PGDATA, because upgrading with hard links and then deleting the old directory would be fatal.

  3. If the schema precheck fails https://github.com/vitabaks/postgresql_cluster/blob/master/automation/roles/upgrade/tasks/schema_compatibility.yml#L109
    the task with reinit is not started https://github.com/vitabaks/postgresql_cluster/blob/master/automation/roles/upgrade/tasks/schema_compatibility.yml#L121
    so the new pgdata is not cleared and postgres is not stopped, which makes it impossible to restart.

  4. Parameters
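
The "TS is not inside PGDATA" check mentioned above could look roughly like the sketch below. This is only an illustration, not the code from the role: the function name and the example PGDATA path are hypothetical, and in practice the tablespace locations would come from the "Print tablespace location" task.

```python
from pathlib import Path


def tablespace_inside_pgdata(pgdata: str, ts_location: str) -> bool:
    """Return True if ts_location is PGDATA itself or lies beneath it.

    Hypothetical helper: compares resolved paths component by component,
    so a sibling such as ".../data_ts" is not mistaken for a child of
    ".../data" the way a plain string prefix check would be.
    """
    data_dir = Path(pgdata).resolve()
    ts_dir = Path(ts_location).resolve()
    return ts_dir == data_dir or data_dir in ts_dir.parents


# Assumed PGDATA path, for illustration only.
pgdata = "/var/lib/pgsql/15/data"
for location in ["/var/lib/pgsql/ts/my_tblspc", "/var/lib/pgsql/15/data/ts"]:
    if tablespace_inside_pgdata(pgdata, location):
        print(f"refusing to upgrade: tablespace {location} is inside PGDATA")
```

Resolving both paths first means a tablespace reached through a symlink that points into PGDATA is caught as well.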

@vitabaks vitabaks self-assigned this Oct 7, 2024
@vitabaks
Owner

vitabaks commented Oct 7, 2024

Thanks for the feedback @glushakov

If possible, attach more details: error logs, steps to reproduce, and an example of the fix code that you mentioned. This will help to implement the improvements sooner.

Also, feel free to open a PR, especially for small fixes.

@glushakov
Author

For some cases there is no code yet, because they are still in the design stage (for example, transferring parameters from postgresql.base.conf).

As an example of the errors, here is the one for an existing tablespace:


$ mkdir ts/my_tblspc
$ psql
postgres=# create tablespace qwe location '/var/lib/pgsql/ts/my_tblspc';

<..cut ..>

TASK [Print tablespace location] ***************************************************************************************************************************************************************************************************************************************************
skipping: [alma-db2]
ok: [alma-db1] => {
    "pg_tablespace_location.stdout_lines": [
        "/var/lib/pgsql/ts/my_tblspc"
    ]
}

<..cut ..>


TASK [Result of checking the compatibility of the scheme - success] ****************************************************************************************************************************************************************************************************************
ok: [alma-db1] => {
    "msg": "The database schema are compatible with PostgreSQL 16"
}

<..cut..>

TASK [Print the result of the pg_upgrade check] ************************************************************************************************************************************************************************************************************************************
fatal: [alma-db1]: FAILED! => {
    "failed_when_result": true,
    "pg_upgrade_check_result.stdout_lines": [
        "Performing Consistency Checks on Old Live Server",
        "------------------------------------------------",
        "Checking cluster versions                                     ok",
        "Checking database user is the install user                    ok",
        "Checking database connection settings                         ok",
        "Checking for prepared transactions                            ok",
        "Checking for system-defined composite types in user tables    ok",
        "Checking for reg* data types in user tables                   ok",
        "Checking for contrib/isn with bigint-passing mismatch         ok",
        "Checking for incompatible \"aclitem\" data type in user tables  ok",
        "Checking for presence of required libraries                   ok",
        "Checking database user is the install user                    ok",
        "Checking for prepared transactions                            ok",
        "Checking for new cluster tablespace directories               ",
        "new cluster tablespace directory already exists: \"/var/lib/pgsql/ts/my_tblspc/PG_16_202307071\"",
        "Failure, exiting"
    ]
}

The "schema precheck" process created a tablespace directory at /var/lib/pgsql/ts/my_tblspc/PG_16_202307071, but did not delete it afterwards.
The pg_upgrade --check process then detected that this directory already existed and failed.

As I said earlier, my solution is to exclude tablespaces from the pg_dumpall command used for the schema check (by adding --no-tablespaces).
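
To make the idea concrete, here is a minimal sketch of how the dump command for the schema check might be assembled. The helper name, the binary directory and the port are assumptions for illustration; only the pg_dumpall options themselves (-p, --schema-only, --no-tablespaces) are real.

```python
def schema_dump_command(bin_dir: str, port: int, no_tablespaces: bool = True) -> list[str]:
    """Build the pg_dumpall argument list for a schema-only dump.

    Hypothetical helper. With no_tablespaces=True the dump contains no
    CREATE TABLESPACE statements, so restoring it into the temporary new
    cluster does not create PG_<version>_* directories in the tablespace
    locations.
    """
    cmd = [f"{bin_dir}/pg_dumpall", "-p", str(port), "--schema-only"]
    if no_tablespaces:
        cmd.append("--no-tablespaces")
    return cmd


# Assumed binary directory and port, for illustration only.
print(" ".join(schema_dump_command("/usr/pgsql-15/bin", 5432)))
```

With this option the tablespace directories are never touched by the schema check, so nothing has to be cleaned up before pg_upgrade --check runs.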
