Import Data
This document describes how to import data into a TiDB cluster in Kubernetes using TiDB Lightning.
TiDB Lightning contains two components: tidb-lightning and tikv-importer. In Kubernetes, tikv-importer is inside the Helm chart of the TiDB cluster and is deployed as a StatefulSet with replicas=1, while tidb-lightning is in a separate Helm chart and deployed as a Job.
TiDB Lightning supports three backends: Importer-backend, Local-backend, and TiDB-backend. For the differences between these backends and how to choose one, see TiDB Lightning Backends.

- For Importer-backend, both tikv-importer and tidb-lightning need to be deployed.
- For Local-backend, only tidb-lightning needs to be deployed.
- For TiDB-backend, only tidb-lightning needs to be deployed, and it is recommended to import data using CustomResourceDefinition (CRD) in TiDB Operator v1.1 and later versions. For details, refer to Restore Data from GCS Using TiDB Lightning or Restore Data from S3-Compatible Storage Using TiDB Lightning.
Deploy TiKV Importer
Note:
If you use the local or tidb backend for data restore, you can skip deploying tikv-importer and deploy tidb-lightning directly.
You can deploy tikv-importer using the Helm chart by taking the following steps:

1. Make sure that the PingCAP Helm repository is up to date:

   helm repo update
   helm search repo tikv-importer -l

2. Get the default values.yaml file for easier customization:

   helm inspect values pingcap/tikv-importer --version=${chart_version} > values.yaml

3. Modify the values.yaml file to specify the target TiDB cluster. See the following example:

   clusterName: demo
   image: pingcap/tidb-lightning:v5.3.0
   imagePullPolicy: IfNotPresent
   storageClassName: local-storage
   storage: 20Gi
   pushgatewayImage: prom/pushgateway:v0.3.1
   pushgatewayImagePullPolicy: IfNotPresent
   config: |
     log-level = "info"
     [metric]
     job = "tikv-importer"
     interval = "15s"
     address = "localhost:9091"

   clusterName must match the name of the target TiDB cluster.

   If the target TiDB cluster has enabled TLS between components (spec.tlsCluster.enabled: true), refer to Generate certificates for components of the TiDB cluster to generate a server-side certificate for TiKV Importer, and configure tlsCluster.enabled: true in values.yaml to enable TLS.

4. Deploy tikv-importer:
helm install ${cluster_name} pingcap/tikv-importer --namespace=${namespace} --version=${chart_version} -f values.yaml
Note:
You must deploy tikv-importer in the same namespace where the target TiDB cluster is deployed.
Deploy TiDB Lightning
Configure TiDB Lightning
Use the following command to get the default configuration of TiDB Lightning:
helm inspect values pingcap/tidb-lightning --version=${chart_version} > tidb-lightning-values.yaml
Configure a backend used by TiDB Lightning depending on your needs. To do that, set the backend value in values.yaml to importer, local, or tidb.
Note:
If you use the local backend, you must set sortedKV in values.yaml to create the corresponding PVC. The PVC is used for local KV sorting.
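For example, with the local backend, the relevant part of values.yaml might look like the following sketch. The sortedKV sub-fields shown here (storageClassName and storage) are assumptions based on the default values.yaml generated above, so verify them against your chart version:

backend: local
# PVC used for local KV sorting; required by the local backend (assumed field layout)
sortedKV:
  storageClassName: local-storage
  storage: 100Gi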
Starting from v1.1.10, the tidb-lightning Helm chart saves the TiDB Lightning checkpoint information in the directory of the source data. When a new tidb-lightning job runs, it can resume the data import according to the checkpoint information.
For versions earlier than v1.1.10, you can modify config in values.yaml to save the checkpoint information in the target TiDB cluster, other MySQL-compatible databases, or a shared storage directory. For more information, refer to TiDB Lightning checkpoint.
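For example, to keep the checkpoint information in a MySQL-compatible database, you could add a [checkpoint] section to the config value in values.yaml. This is only a sketch based on the TiDB Lightning checkpoint documentation; the DSN below is a placeholder:

config: |
  [checkpoint]
  enable = true
  # "mysql" stores checkpoints in a MySQL-compatible database; "file" stores them in a local file
  driver = "mysql"
  # Placeholder DSN; point it at the target TiDB cluster or another MySQL-compatible database
  dsn = "${user}:${password}@tcp(${checkpoint_db_host}:4000)/"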
If TLS between components has been enabled on the target TiDB cluster (spec.tlsCluster.enabled: true), refer to Generate certificates for components of the TiDB cluster to generate a server-side certificate for TiDB Lightning, and configure tlsCluster.enabled: true in values.yaml to enable TLS between components.
If the target TiDB cluster has enabled TLS for the MySQL client (spec.tidb.tlsClient.enabled: true), and the corresponding client-side certificate is configured (the Kubernetes Secret object is ${cluster_name}-tidb-client-secret), you can configure tlsClient.enabled: true in values.yaml to enable TiDB Lightning to connect to the TiDB server using TLS.
To use different client certificates to connect to the TiDB server, refer to Issue two sets of certificates for the TiDB cluster to generate the client-side certificate for TiDB Lightning, and configure the corresponding Kubernetes secret object in tlsCluster.tlsClientSecretName in values.yaml.
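Putting these options together, the TLS-related part of values.yaml would look roughly like the following sketch. The field layout is only inferred from the option paths named above, so check the default values.yaml of the chart for the exact structure:

tlsCluster:
  enabled: true
  # Optional: a dedicated client certificate for TiDB Lightning, as described above
  # tlsClientSecretName: ${lightning_client_secret_name}
tlsClient:
  enabled: true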
Note:
If TLS is enabled between components via tlsCluster.enabled: true but not enabled between TiDB Lightning and the TiDB server via tlsClient.enabled: true, you need to explicitly disable TLS between TiDB Lightning and the TiDB server by setting the following in config in values.yaml:
[tidb]
tls="false"
The TiDB Lightning Helm chart supports both local and remote data sources.
Local
In the local mode, the backup data must be on one of the Kubernetes nodes. To enable this mode, set dataSource.local.nodeName to the node name and dataSource.local.hostPath to the path of the backup data. The path should contain a file named metadata.
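For example, the corresponding section of values.yaml would look like this (the node name and path are placeholders):

dataSource:
  local:
    nodeName: ${node_name}
    # Directory on the node that contains the backup data, including the metadata file
    hostPath: /data/backup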
Remote
Unlike the local mode, the remote mode needs to use rclone to download the backup tarball file from a network storage to a PV. Any cloud storage supported by rclone should work, but currently only the following have been tested: Google Cloud Storage (GCS), Amazon S3, and Ceph Object Storage.
To restore backup data from the remote source, take the following steps:
1. Make sure that dataSource.local.nodeName and dataSource.local.hostPath in values.yaml are commented out.

2. Grant permissions to the remote storage:

   - If you use Amazon S3 as the storage, refer to AWS account permissions. The configuration varies with different permission granting methods.
   - If you use Ceph as the storage, you can only grant permissions by importing AccessKey and SecretKey. See Grant permissions by AccessKey and SecretKey.
   - If you use GCS as the storage, refer to GCS account permissions.
Grant permissions by importing AccessKey and SecretKey
1. Create a Secret configuration file secret.yaml containing the rclone configuration. A sample configuration is listed below. Only one cloud storage configuration is required.

   apiVersion: v1
   kind: Secret
   metadata:
     name: cloud-storage-secret
   type: Opaque
   stringData:
     rclone.conf: |
       [s3]
       type = s3
       provider = AWS
       env_auth = false
       access_key_id = ${access_key}
       secret_access_key = ${secret_key}
       region = us-east-1
       [ceph]
       type = s3
       provider = Ceph
       env_auth = false
       access_key_id = ${access_key}
       secret_access_key = ${secret_key}
       endpoint = ${endpoint}
       region = :default-placement
       [gcs]
       type = google cloud storage
       # The service account must include Storage Object Viewer role
       # The content can be retrieved by `cat ${service-account-file} | jq -c .`
       service_account_credentials = ${service_account_json_file_content}

2. Execute the following command to create the Secret:

   kubectl apply -f secret.yaml -n ${namespace}
Grant permissions by associating IAM with Pod or with ServiceAccount
If you use Amazon S3 as the storage, you can grant permissions by associating IAM with Pod or with ServiceAccount, in which case s3.access_key_id and s3.secret_access_key can be ignored.

1. Save the following configurations as secret.yaml.

   apiVersion: v1
   kind: Secret
   metadata:
     name: cloud-storage-secret
   type: Opaque
   stringData:
     rclone.conf: |
       [s3]
       type = s3
       provider = AWS
       env_auth = true
       access_key_id =
       secret_access_key =
       region = us-east-1

2. Execute the following command to create the Secret:

   kubectl apply -f secret.yaml -n ${namespace}
3. Configure dataSource.remote.storageClassName in values.yaml to an existing storage class in the Kubernetes cluster.
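As a rough sketch, the remote data source section of values.yaml could then look like the following. Apart from storageClassName, the field names shown here (storage, secretName, and path) are assumptions based on the default values.yaml generated earlier, so verify them against that file:

dataSource:
  remote:
    storageClassName: ${storage_class_name}
    # Assumed fields; check the default values.yaml of the tidb-lightning chart:
    storage: 100Gi                    # size of the PV that receives the downloaded backup
    secretName: cloud-storage-secret  # the Secret containing rclone.conf created above
    path: s3:${bucket}/backup-2020-12-17T10:12:51Z.tgz  # rclone path to the backup tarball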
Ad hoc
When you restore data from remote storage, the restore process might be interrupted due to an exception. In such cases, if you do not want to download the backup data from the network storage repeatedly, you can use the ad hoc mode to directly recover the data that has already been downloaded and decompressed into the PV in the remote mode. The steps are as follows:
1. Ensure that dataSource.local and dataSource.remote in the values.yaml file are empty.

2. Configure dataSource.adhoc.pvcName in values.yaml to the PVC name used when restoring data from remote storage.

3. Configure dataSource.adhoc.backupName in values.yaml to the name of the original backup data, such as backup-2020-12-17T10:12:51Z (do not include the .tgz suffix of the compressed file name on the network storage).
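For example (the PVC name is a placeholder; the backup name matches the example above):

dataSource:
  adhoc:
    pvcName: ${pvc_name}  # the PVC created when restoring data from remote storage
    backupName: backup-2020-12-17T10:12:51Z  # without the .tgz suffix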
Deploy TiDB Lightning
The method of deploying TiDB Lightning varies with different methods of granting permissions and with different storages.
For the local mode, the ad hoc mode, and the remote mode (only when the remote mode meets one of the following three requirements: granting permissions with Amazon S3 AccessKey and SecretKey, using Ceph as the storage backend, or using GCS as the storage backend), run the following command to deploy TiDB Lightning.
helm install ${release_name} pingcap/tidb-lightning --namespace=${namespace} --set failFast=true -f tidb-lightning-values.yaml --version=${chart_version}
For the remote mode, if you grant permissions by associating Amazon S3 IAM with Pod, take the following steps:

1. Create the IAM role:

   Create an IAM role for the account, and grant the required permission to the role. The IAM role requires the AmazonS3FullAccess permission because TiDB Lightning needs to access Amazon S3 storage.

2. Modify tidb-lightning-values.yaml, and add the iam.amazonaws.com/role: arn:aws:iam::123456789012:role/user annotation in the annotations field. A sketch of this annotation is shown after the note below.

3. Deploy TiDB Lightning:
helm install ${release_name} pingcap/tidb-lightning --namespace=${namespace} --set failFast=true -f tidb-lightning-values.yaml --version=${chart_version}
Note:
arn:aws:iam::123456789012:role/user is the IAM role created in Step 1.
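For reference, the annotation added in step 2 might look like the following in tidb-lightning-values.yaml (a sketch; the annotations field is assumed to be applied to the TiDB Lightning Pod so that the IAM role can be picked up):

annotations:
  iam.amazonaws.com/role: arn:aws:iam::123456789012:role/user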
For the remote mode, if you grant permissions by associating Amazon S3 with ServiceAccount, take the following steps:

1. Enable the IAM role for the service account on the cluster:

   To enable the IAM role permission on the EKS cluster, see AWS Documentation.

2. Create the IAM role:

   Create an IAM role. Grant the AmazonS3FullAccess permission to the role, and edit Trust relationships of the role.

3. Associate IAM with the ServiceAccount resources:

   kubectl annotate sa ${serviceaccount} -n ${namespace} eks.amazonaws.com/role-arn=arn:aws:iam::123456789012:role/user

4. Deploy TiDB Lightning:

   helm install ${release_name} pingcap/tidb-lightning --namespace=${namespace} --set-string failFast=true,serviceAccount=${serviceaccount} -f tidb-lightning-values.yaml --version=${chart_version}

Note:
- arn:aws:iam::123456789012:role/user is the IAM role created in Step 2.
- ${serviceaccount} is the ServiceAccount used by TiDB Lightning. The default value is default.
Destroy TiKV Importer and TiDB Lightning
Currently, TiDB Lightning only supports restoring data offline. After the restore, if the TiDB cluster needs to provide service for external applications, you can destroy TiDB Lightning to save cost.
To destroy tikv-importer, execute the following command:
helm uninstall ${release_name} -n ${namespace}
To destroy tidb-lightning, execute the following command:
helm uninstall ${release_name} -n ${namespace}
Troubleshoot TiDB Lightning
When TiDB Lightning fails to restore data, you cannot simply restart it; manual intervention is required. Therefore, the restart policy of the TiDB Lightning Job is set to Never.
Note:
If you have not configured the checkpoint information to be persisted in the target TiDB cluster, other MySQL-compatible databases, or a shared storage directory, then after the restore fails, you need to first delete the part of data already restored to the target cluster. After that, deploy tidb-lightning again and retry the data restore.
If TiDB Lightning fails to restore data, and you have configured the checkpoint information to be persisted in the target TiDB cluster, other MySQL-compatible databases, or a shared storage directory, follow the steps below for manual intervention:
1. View the log by executing the following command:

   kubectl logs -n ${namespace} ${pod_name}
2. If you restore data using the remote data source and an error occurs when TiDB Lightning downloads data from remote storage:

   - Address the problem according to the log.
   - Deploy tidb-lightning again and retry the data restore.

   For other cases, refer to the following steps.
3. Refer to TiDB Lightning Troubleshooting and learn the solutions to different issues.
4. Address the issues accordingly:
   - If tidb-lightning-ctl is required:

     1. Configure dataSource in values.yaml. Make sure the new Job uses the data source and checkpoint information of the failed Job:

        - In the local or ad hoc mode, you do not need to modify dataSource.
        - In the remote mode, modify dataSource to the ad hoc mode. dataSource.adhoc.pvcName is the PVC name created by the original Helm chart. dataSource.adhoc.backupName is the backup name of the data to be restored.

     2. Modify failFast in values.yaml to false, and create a Job used for tidb-lightning-ctl.

        - Based on the checkpoint information, TiDB Lightning checks whether the last data restore encountered an error. If yes, TiDB Lightning pauses the restore automatically.
        - TiDB Lightning uses the checkpoint information to avoid repeatedly restoring the same data. Therefore, creating the Job does not affect data correctness.

     3. After the Pod corresponding to the new Job is running, view the log by running kubectl logs -n ${namespace} ${pod_name} and confirm that tidb-lightning in the new Job has stopped the data restore. If the log has the following message, the data restore is stopped:

        tidb lightning encountered error
        tidb lightning exit

     4. Enter the container by running kubectl exec -it -n ${namespace} ${pod_name} -- sh.

     5. Obtain the starting script by running cat /proc/1/cmdline.

     6. Get the command-line parameters from the starting script. Refer to TiDB Lightning Troubleshooting and troubleshoot using tidb-lightning-ctl.

     7. After the troubleshooting, modify failFast in values.yaml to true and create a new Job to resume the data restore.
   - If tidb-lightning-ctl is not required:

     1. Configure dataSource in values.yaml. Make sure the new Job uses the data source and checkpoint information of the failed Job:

        - In the local or ad hoc mode, you do not need to modify dataSource.
        - In the remote mode, modify dataSource to the ad hoc mode. dataSource.adhoc.pvcName is the PVC name created by the original Helm chart. dataSource.adhoc.backupName is the backup name of the data to be restored.

     2. Create a new Job using the modified values.yaml file and resume the data restore.
5. After the troubleshooting and the data restore are completed, delete the Jobs for data restore and troubleshooting.