
Scraping Tokopedia Product Data with Python

 


Hello everyone. This time we will look at how to scrape data from the Tokopedia website. Put simply, we will write a Python script that extracts data from a source (a website, database, enterprise application, or legacy system) and saves it to a file in a tabular or spreadsheet format.

Before starting, make sure the following libraries are available:

  1. Pandas
     $ py -m pip install pandas
  2. Requests
     $ py -m pip install requests
  3. math (part of the Python standard library, so there is nothing to install)
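
To confirm the libraries are available before running the scraper, a small check like the one below may help. It is only a sketch: the helper name missing_packages is made up for illustration, and it relies on the standard-library function importlib.util.find_spec to look up each package without importing it.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # The three libraries this tutorial uses.
    missing = missing_packages(["pandas", "requests", "math"])
    if missing:
        print("Please install:", ", ".join(missing))
    else:
        print("All required libraries are available.")
```
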
Next, we need to find the Tokopedia API request that the site itself uses for product search, because that is the request our script will replay to fetch the data. To find it, open the browser's developer tools (inspect element) by pressing F12. The panel looks like the image below.

Then, on the Network tab, select the Fetch/XHR filter. If no requests are listed as in the image below, press Ctrl + R to reload the page. Next, select SearchProductQueryV4, choose Copy as cURL (bash), and paste the copied command into Visual Studio Code.

The next step is to write the program, starting with importing the libraries.
import requests
import pandas as pd
import math
Take the cURL command you copied earlier. It cannot be used in Python as it stands, so it has to be edited first: remove the -H flags, remove the trailing backslashes (\), assign the curl URL to a variable, and remove --data-raw $ and --compressed. The headers then become a dictionary and the request body becomes a string. The copied command is shown first, followed by the edited code with the extra logic added.
curl 'https://gql.tokopedia.com/graphql/SearchProductQueryV4' \
  -H 'authority: gql.tokopedia.com' \
  -H 'accept: */*' \
  -H 'accept-language: id-ID,id;q=0.9,en-US;q=0.8,en;q=0.7' \
  -H 'content-type: application/json' \
  -H 'cookie: _gcl_au=1.1.1413001030.1672228217; _UUID_NONLOGIN_=c1e6d871a598c87993e83a1501ab6f32; DID=c17aa6dcf617cbed0e6fba4639e43a825b415d046ba7852e650167030893d15c7b82e1ff1b4981e31e45c710a91c3b9c; DID_JS=YzE3YWE2ZGNmNjE3Y2JlZDBlNmZiYTQ2MzllNDNhODI1YjQxNWQwNDZiYTc4NTJlNjUwMTY3MDMwODkzZDE1YzdiODJlMWZmMWI0OTgxZTMxZTQ1YzcxMGE5MWMzYjlj47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=; _UUID_CAS_=cf7fbfcb-eb64-4d6c-950c-4f9fa2b9d118; _CASE_=2c75331e33756d656560637b75361e33756d677b753b353b756d751d363c36252336770722243623757b75341e33756d6660617b753b383930756d75757b753b3623756d75757b75271438756d75757b75201e33756d66656566676460627b75241e33756d66666264676260647b7524032e2732756d75653f757b75203f24756d750c2c0b75203625323f38222432083e330b756d66656566676460627b0b75243225213e343208232e27320b756d0b75653f0b757b0b750808232e273239363a320b756d0b75003625323f38222432240b752a7b2c0b75203625323f38222432083e330b756d677b0b75243225213e343208232e27320b756d0b7566623a0b757b0b750808232e273239363a320b756d0b75003625323f38222432240b752a0a757b753b022733756d75656765657a66657a656f03666f6d62666d66657c67606d6767752a; __auc=1457b782185589295917cd3559a; _gcl_aw=GCL.1673789178.CjwKCAiA5Y6eBhAbEiwA_2ZWITpDT3tQihrnx6f12R1Z_SS5v8I4QOajIsLsCgKIHBmbr7AQhXs29hoCq7MQAvD_BwE; _gac_UA-126956641-6=1.1673789184.CjwKCAiA5Y6eBhAbEiwA_2ZWITpDT3tQihrnx6f12R1Z_SS5v8I4QOajIsLsCgKIHBmbr7AQhXs29hoCq7MQAvD_BwE; hfv_banner=true; _gac_UA-9801603-1=1.1673790632.CjwKCAiA5Y6eBhAbEiwA_2ZWITpDT3tQihrnx6f12R1Z_SS5v8I4QOajIsLsCgKIHBmbr7AQhXs29hoCq7MQAvD_BwE; bm_sz=407F36BCEED7397B2796CCBAFF2620BC~YAAQjawwF83rwqSFAQAA3Ot9wBKaKZdbB9Y7jhId4dnxST3HQMpCcM8fw9GzoBRlw3XS7cxHSRdesUmndvzw5MrWXkF8CCBrNf55zkb0irB+X0GSiJ7jlrfQhPJm5pyOJweP7MljCNkGhvoXsFdeALV6hogaciQ8eZszuw5MjLFJqK+8R5TF197k8snHi35W6asNfHIFQFC7QjCJf1Xdscwq6pRiDBDzRMEom0ZJLFA5QH6mxXxyDvweAaxknlCJpuQNulK9uYX3Z9flS2ub/yej6tx838qGRtOrbPZjUFGb35aHKfA=~3618871~3551289; 
bm_mi=D1ECD6D69DF4FCF2447CA87DD3B36DCC~YAAQjawwF1TtwqSFAQAAT/h9wBKECteQpow38LeC7AF6lJBbNnWwoXOIabtjZ0aYYwP82+DDjKvq/3Iy8aSOF1VpWZ7n5lo43QM/3wcZAfCXDBqOAhKW0DHYl1f+HtfY0BhaSPowXHaRps5Bps7shyzI4xCCJlJ/EjoDPnRNyG+pOO0eR81EEPvKTmnxorPFpI+meB92180YpAbIzuCRR2wlRjXGLrN6hPcn2K639lyRg7OysVGeD0wKRwA6C9qCJOSw+UeOzsBLzN2yasv24UHDzu0KhAMG1aUAX1M7m+Is1/9XgCuF4N2em+EeBT4=~1; _SID_Tokopedia_=FkOk2TYitSctm40z6gGU5d0rvFbmBfnuTekFzj5sfhrNCACYiHTFqsOhAnULdaIF0gKZAObdOYW1rkMnfFmgEAKYkkABG4SwICQ8y9Sn23jxpfLCfE3uUwaGvCYdVDlQ; _gid=GA1.2.954641500.1673971765; ak_bmsc=406E089203B03D88D47767E1E217F9FD~000000000000000000000000000000~YAAQjawwF078wqSFAQAAsXt+wBIT3sGullRUNTQOxWYfvUfnJHjOSUmYmeOSkNZFVgu5iBocwkqC9vouEj5QJpUHI3yfwqLTKiPWk6hPIewTMeRS1zhYRKjWw6m8qAiFKxWwz81KTZULKsQMkgSu7OEyh54SjCA5Kjqq4msj5Fkt8sI7Rxz20JisZ0kTm+ifNDcnGAuwXwEuKv/OwR+zaZOs4V/ZlN057HaxgjYJMJdAQbNyGxaS/IcWjOLyFAJUysP9PuN9U861QbpUoAByFFgk+hbcOPedDfvU78IbfG8QGtI9DLUK9EiM41s+RiNACl8TzDqBYZKkkcmkTPiGo56Ss5iThx1KeRCPHxlzKNy2YQD11hYqkrkcE8Ev8qQoTdZ1uTA8sF+EDUXxsHOd95D12RkWR9j2u5J5TCrJHUVBSElUv0iH1b3e8+5VXsjpzjGwfGhdvYHplzDtdL0kPX611vA7NQamdCiWM2JQUf/lCs799UHZ38bw/3Lx3faUHBHp; AMP_TOKEN=%24NOT_FOUND; _abck=9753AC2451958445CFC616A8A0ACC69F~0~YAAQjawwF7Caw6SFAQAA8z6LwAkFDeCLlOgT11GetPYNLFZ8pAJVzwWYLl1s3cjd6hMuDGbAqmTXgRWFLvkoFiGerB8edHuVSOGZtUrTDK/LF6Ydj4hlOGJn4h5QZAjak06hXhYdeISENSoLneTNBlyxtjNOxDU9jB+AiAIZKtNkmKVp+H4oPVXY6xZ7kPuTlFP/A+hHPlsvCiPtnzM8AGXiTPdMBzGIaA8yePHjNoVPAVzX0YujY4NYNLO79XEnwVkcms5SrCmfpJPMzagoVgKBx3SQUCB3LemfpFFKy/zjxvMy/fcNjifEXO1XWqckmO7sI8jE3eCe/KFJdbArlmt/GQmWz3zQm7DhDiAG0cUM0UMRPWyKEFO54DhWx1MLxnHgMne8MH2Spk4s7U5B6432CfpAxLOFqxYG~-1~-1~-1; __asc=acf830bd185c0a8e39129c5d96b; _dc_gtm_UA-126956641-6=1; _dc_gtm_UA-9801603-1=1; _ga_70947XW48P=GS1.1.1673971765.5.1.1673974635.60.0.0; _ga=GA1.1.1251187054.1672228218' \
  -H 'origin: https://www.tokopedia.com' \
  -H 'referer: https://www.tokopedia.com/search?st=product&q=keybord%20RGB' \
  -H 'sec-ch-ua: "Not_A Brand";v="99", "Google Chrome";v="109", "Chromium";v="109"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Windows"' \
  -H 'sec-fetch-dest: empty' \
  -H 'sec-fetch-mode: cors' \
  -H 'sec-fetch-site: same-site' \
  -H 'tkpd-userid: 0' \
  -H 'user-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36' \
  -H 'x-device: desktop-0.0' \
  -H 'x-source: tokopedia-lite' \
  -H 'x-tkpd-lite-service: zeus' \
  -H 'x-version: bacb7c4' \
  --data-raw $'[{"operationName":"SearchProductQueryV4","variables":{"params":"device=desktop&ob=23&page=1&q=keybord%20RGB&related=true&rows=60&safe_search=false&scheme=https&shipping=&source=search&st=product&start=0&topads_bucket=true&unique_id=c1e6d871a598c87993e83a1501ab6f32&user_addressId=&user_cityId=176&user_districtId=2274&user_id=&user_lat=&user_long=&user_postCode=&user_warehouseId=12210375&variants="},"query":"query SearchProductQueryV4($params: String\u0021) {\\n  ace_search_product_v4(params: $params) {\\n    header {\\n      totalData\\n      totalDataText\\n      processTime\\n      responseCode\\n      errorMessage\\n      additionalParams\\n      keywordProcess\\n      componentId\\n      __typename\\n    }\\n    data {\\n      banner {\\n        position\\n        text\\n        imageUrl\\n        url\\n        componentId\\n        trackingOption\\n        __typename\\n      }\\n      backendFilters\\n      isQuerySafe\\n      ticker {\\n        text\\n        query\\n        typeId\\n        componentId\\n        trackingOption\\n        __typename\\n      }\\n      redirection {\\n        redirectUrl\\n        departmentId\\n        __typename\\n      }\\n      related {\\n        position\\n        trackingOption\\n        relatedKeyword\\n        otherRelated {\\n          keyword\\n          url\\n          product {\\n            id\\n            name\\n            price\\n            imageUrl\\n            rating\\n            countReview\\n            url\\n            priceStr\\n            wishlist\\n            shop {\\n              city\\n              isOfficial\\n              isPowerBadge\\n              __typename\\n            }\\n            ads {\\n              adsId: id\\n              productClickUrl\\n              productWishlistUrl\\n              shopClickUrl\\n              productViewUrl\\n              __typename\\n            }\\n            badges {\\n              title\\n              imageUrl\\n              
show\\n              __typename\\n            }\\n            ratingAverage\\n            labelGroups {\\n              position\\n              type\\n              title\\n              url\\n              __typename\\n            }\\n            componentId\\n            __typename\\n          }\\n          componentId\\n          __typename\\n        }\\n        __typename\\n      }\\n      suggestion {\\n        currentKeyword\\n        suggestion\\n        suggestionCount\\n        instead\\n        insteadCount\\n        query\\n        text\\n        componentId\\n        trackingOption\\n        __typename\\n      }\\n      products {\\n        id\\n        name\\n        ads {\\n          adsId: id\\n          productClickUrl\\n          productWishlistUrl\\n          productViewUrl\\n          __typename\\n        }\\n        badges {\\n          title\\n          imageUrl\\n          show\\n          __typename\\n        }\\n        category: departmentId\\n        categoryBreadcrumb\\n        categoryId\\n        categoryName\\n        countReview\\n        customVideoURL\\n        discountPercentage\\n        gaKey\\n        imageUrl\\n        labelGroups {\\n          position\\n          title\\n          type\\n          url\\n          __typename\\n        }\\n        originalPrice\\n        price\\n        priceRange\\n        rating\\n        ratingAverage\\n        shop {\\n          shopId: id\\n          name\\n          url\\n          city\\n          isOfficial\\n          isPowerBadge\\n          __typename\\n        }\\n        url\\n        wishlist\\n        sourceEngine: source_engine\\n        __typename\\n      }\\n      violation {\\n        headerText\\n        descriptionText\\n        imageURL\\n        ctaURL\\n        ctaApplink\\n        buttonText\\n        buttonType\\n        __typename\\n      }\\n      __typename\\n    }\\n    __typename\\n  }\\n}\\n"}]' \
  --compressed
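
The manual header edits described above can also be done by a small script. The following is a minimal sketch, not part of the original tutorial: the function name curl_headers_to_dict is hypothetical, it only handles the -H/--header flags, and it assumes the command was copied in the "cURL (bash)" format shown above.

```python
import shlex


def curl_headers_to_dict(curl_command):
    """Collect every -H 'name: value' pair of a curl command into a dict."""
    # Join the backslash-continued lines so shlex sees a single command line.
    one_line = curl_command.replace("\\\n", " ")
    tokens = shlex.split(one_line)

    headers = {}
    # Walk over (flag, value) pairs: a header value always follows its -H flag.
    for flag, value in zip(tokens, tokens[1:]):
        if flag in ("-H", "--header"):
            # Split on the first colon only, so values such as URLs stay intact.
            name, _, content = value.partition(":")
            headers[name.strip().lower()] = content.strip()
    return headers


if __name__ == "__main__":
    sample = (
        "curl 'https://gql.tokopedia.com/graphql/SearchProductQueryV4' \\\n"
        "  -H 'accept: */*' \\\n"
        "  -H 'origin: https://www.tokopedia.com' \\\n"
        "  --compressed"
    )
    print(curl_headers_to_dict(sample))
```

The resulting dictionary can be passed as the headers argument of requests.post, in the same way as the hand-edited dictionary.
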
The edited code looks like this.
url_target = 'https://gql.tokopedia.com/graphql/SearchProductQueryV4'
header = {'authority': 'gql.tokopedia.com',
   'accept': '*/*',
   'accept-language': 'en-US,en;q=0.9',
   'cache-control': 'no-cache',
   'content-type': 'application/json',
   'cookie': '_gcl_au=1.1.742344446.1668161227; DID=af3dc19da02ce29f6a0b5b42a0c49298cde2ceff9735aed91a4ce0ac9fa5c43bef01c3977563b91e5eb1ab63e1cd5577; DID_JS=YWYzZGMxOWRhMDJjZTI5ZjZhMGI1YjQyYTBjNDkyOThjZGUyY2VmZjk3MzVhZWQ5MWE0Y2UwYWM5ZmE1YzQzYmVmMDFjMzk3NzU2M2I5MWU1ZWIxYWI2M2UxY2Q1NTc347DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=; _UUID_NONLOGIN_=f419ede0ae33a45c36cd442f6b801712; _jxx=66c12850-e572-11ec-bda2-4f63525eb407; _jx=66c12850-e572-11ec-bda2-4f63525eb407; _UUID_CAS_=a6e9168d-d9ef-4275-a19a-a74ec4995d41; _CASE_=7d24624f62243c343431322a24674f62243c362a246a646a243c244c676d67747267265673756772242a24654f62243c3731302a246a696861243c24242a246a6772243c24242a24764569243c24242a24714f62243c37343437363531332a24754f62243c37373335363331352a2475527f7663243c24346e242a24716e75243c245d7d5a24716774636e69737563596f625a243c37343437363531332a5a24756374706f656359727f76635a243c5a24346e5a242a5a245959727f766368676b635a243c5a24516774636e69737563755a247b2a7d5a24716774636e69737563596f625a243c362a5a24756374706f656359727f76635a243c5a2437336b5a242a5a245959727f766368676b635a243c5a24516774636e69737563755a247b5b242a246a537662243c24343634342b37372b37375237313c36313c363f2d36313c3636247b; __auc=5794a37d18466286071c95daa13; hfv_banner=true; _gid=GA1.2.1854659675.1668649242; bm_sz=05EEAD4804A6F8E6BAB6767459CA6E76~YAAQzOwZuM4gTIKEAQAAT2jAiRH7AJio4on1JpPx7bHwuPi3ZUgJYjnOJiMEx32J9WxatoY7KLMZTLbgltZT8sZtRLY9O0dAliIfxxlamkfymIUR7cfTyQ3VYMvwBWUeGSJuDXpG+4GJydoef7VURIFsW5/Iu8NbCxMqkptk4e9dO0l4OGRgq/ogA1ohBfK7kUP7DBTDVoyjZCd6U88sRRUCDAsqNuXLlnkAhHudR7oZ+IL95uuK0cD3jFJ8Pu8NqntBPXRLPRvWA/inqRdfD9UNELFQ6z1X8Ha5XImYQTntFnmXGnA=~3617079~3356225; bm_mi=91038649E3C89AF53C6A2BD5516B728D~YAAQzOwZuAcjTIKEAQAApX/AiRHtd8wP/Z3ssr0KtY6iTHfx8CVHSATWfqPKx8WU+hm1IK6/tBzIc8p+a0Pp/Hcgmhycfg03PskA91NMaQSxkuG/cfj8EnnWOrKi4oHPSdxtfpaPKPP9hnbCY7m3rtxGnQNeL2n+D6magGRnkNLU/foQTl3PmvmXRpK524dAYJzwCujJieOqlYFD+KGEYdLC1I0t6Afe93KMbhYZJ/igzOTDMGMoJ/uxBUPWxRaLFPL1t/r+db8wVQ/h36DhuOGNYpStr6pWtG6D6MwzEL9NA9wR74h39jBO6ghCePI3hQ==~1; 
_abck=B2D8A04A236B2AA069C87B3A5DE321BE~0~YAAQzOwZuKQjTIKEAQAAWYjAiQjIRiB3dF+bimVc5Mm8SWSKjbyMy+u8Qe3wn/wtzCTAyg7+cJY2MzeF1y7whvmc4xzG2Zz2wTF/42HhRGBX9pzP617M3lpbPMDdFFM46BpUOgt50moIwObnZmuLp1S9cGWRSslLIJqHI1BeE/AHcqZpWnxqEB8E+hQj+scUWx2Yyw+Z1VeC2aHeIHBmMxOgU/PXKnXL95SRGQwRGHXeYEOsLCzBPEEQntvNxJ9T+vtg4l9+hl+Iiv7GoXrYryD/4iWYDgG0VxA7TCuFnyF2I+w3dwvi2osDA6wCTcF99k2yuYhe1lVlEN8GxRBwVvkuuE76fw1KgHnnbyNm0NZgTZ35DsfSExPLVXcqXV/rcrhRRos943/KPBdYo0p2IowoUmqTYi89J+0=~-1~-1~-1; _SID_Tokopedia_=cyVbwbapm6sAFa2KyDu5FzA0d-3pXu96LerPKwMyvB6_HxtpdgrJq0RCB0tEAcQAn5S0hDVMmUS3FLWzrv_w7DlaFkdsUS-Ti6vDgSfqFcGrNmgl4JkMGDlTYVL-_mHL; _jxxs=1668758411-66c12850-e572-11ec-bda2-4f63525eb407; _jxs=1668758411-66c12850-e572-11ec-bda2-4f63525eb407; __asc=5e24a02c18489c09f429400be92; ak_bmsc=4E453508B75C49E6359C52BCB9E15374~000000000000000000000000000000~YAAQzOwZuIglTIKEAQAAlajAiRFZTo7ufYKAGvp8UEIf5jwbVBLx7+dcB2RH/CIjZsgCAi4V2rXa30kvDmVuF/KwhpONRAO5GfqiccJy4zSZxKwGOVoOxjEKtiiQdstgCQEx+wJNOGPrp+54yUUuMSwu6V1iXiheFfuPhMB5EDl6RjUx01r47Pd5F3yB/Le6R51T0HoMMKLoEO3f9yyCNt7WFE+2lZy5iq8YmdzRqdoXxWcuZfwp10TrOWySO5tiRZ7vM+7nGQJDX3uUT37lG5N2btaKwQVYNpR5p03faqopU6iPyZmFmf5oXygnLdKlqJIrXCpU5WtfFfdsDcDfWt1FtRV1H7i4aEgN7cqQRwYSVUq7fBToWdsF7dXdNfGIsXBVP8ToORdXgnTVpGVjwO84sDjiPI1y4dFlysZ5egzFKnlEnaO33MfjU1MH1r6/F2kfOBWQGyjNyZz++Gjq7ivVSA1NVz+ScoSGESc9Tn4g2FIvqx2NOiSs6LJ+4BV6jDkrHJg=; _dc_gtm_UA-126956641-6=1; _ga_70947XW48P=GS1.1.1668758410.12.1.1668758574.57.0.0; AMP_TOKEN=%24NOT_FOUND; _ga=GA1.2.824419592.1668161229; _dc_gtm_UA-9801603-1=1',
   'origin': 'https://www.tokopedia.com',
   'pragma': 'no-cache',
   'referer': 'https://www.tokopedia.com/search?st=product&q=baju%20anak%20perempuan&srp_component_id=02.01.00.00&srp_page_id=&srp_page_title=&navsource=',
   'sec-ch-ua': '"Google Chrome";v="107", "Chromium";v="107", "Not=A?Brand";v="24"',
   'sec-ch-ua-mobile': '?0',
   'sec-ch-ua-platform': '"Linux"',
   'sec-fetch-dest': 'empty',
   'sec-fetch-mode': 'cors',
   'sec-fetch-site': 'same-site',
   'tkpd-userid': '0',
   'user-agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36',
   'x-device': 'desktop-0.0',
   'x-source': 'tokopedia-lite',
   'x-tkpd-lite-service': 'zeus',
   'x-version': '1fbf287'}

def cek_jumlah_data(kata_kunci):
    init_query = f'[{{"operationName":"SearchProductQueryV4","variables":{{"params":"device=desktop&navsource=&ob=23&page=1&q={kata_kunci}&related=true&rows=60&safe_search=false&scheme=https&shipping=&source=search&srp_component_id=01.07.00.00&srp_page_id=&srp_page_title=&st=product&start=0&topads_bucket=true&unique_id=1fdabea77fbaf5f1954bdbc40d4a9337&user_addressId=113228966&user_cityId=171&user_districtId=2233&user_id=7773903&user_lat=-6.377643399999999&user_long=106.7621449&user_postCode=16516&user_warehouseId=0&variants="}},"query":"query SearchProductQueryV4($params: String!) {{\\n  ace_search_product_v4(params: $params) {{\\n    header {{\\n      totalData\\n      totalDataText\\n      processTime\\n      responseCode\\n      errorMessage\\n      additionalParams\\n      keywordProcess\\n      componentId\\n      __typename\\n    }}\\n    data {{\\n      banner {{\\n        position\\n        text\\n        imageUrl\\n        url\\n        componentId\\n        trackingOption\\n        __typename\\n      }}\\n      backendFilters\\n      isQuerySafe\\n      ticker {{\\n        text\\n        query\\n        typeId\\n        componentId\\n        trackingOption\\n        __typename\\n      }}\\n      redirection {{\\n        redirectUrl\\n        departmentId\\n        __typename\\n      }}\\n      related {{\\n        position\\n        trackingOption\\n        relatedKeyword\\n        otherRelated {{\\n          keyword\\n          url\\n          product {{\\n            id\\n            name\\n            price\\n            imageUrl\\n            rating\\n            countReview\\n            url\\n            priceStr\\n            wishlist\\n            shop {{\\n              city\\n              isOfficial\\n              isPowerBadge\\n              __typename\\n            }}\\n            ads {{\\n              adsId: id\\n              productClickUrl\\n              productWishlistUrl\\n              shopClickUrl\\n              productViewUrl\\n 
             __typename\\n            }}\\n            badges {{\\n              title\\n              imageUrl\\n              show\\n              __typename\\n            }}\\n            ratingAverage\\n            labelGroups {{\\n              position\\n              type\\n              title\\n              url\\n              __typename\\n            }}\\n            componentId\\n            __typename\\n          }}\\n          componentId\\n          __typename\\n        }}\\n        __typename\\n      }}\\n      suggestion {{\\n        currentKeyword\\n        suggestion\\n        suggestionCount\\n        instead\\n        insteadCount\\n        query\\n        text\\n        componentId\\n        trackingOption\\n        __typename\\n      }}\\n      products {{\\n        id\\n        name\\n        ads {{\\n          adsId: id\\n          productClickUrl\\n          productWishlistUrl\\n          productViewUrl\\n          __typename\\n        }}\\n        badges {{\\n          title\\n          imageUrl\\n          show\\n          __typename\\n        }}\\n        category: departmentId\\n        categoryBreadcrumb\\n        categoryId\\n        categoryName\\n        countReview\\n        customVideoURL\\n        discountPercentage\\n        gaKey\\n        imageUrl\\n        labelGroups {{\\n          position\\n          title\\n          type\\n          url\\n          __typename\\n        }}\\n        originalPrice\\n        price\\n        priceRange\\n        rating\\n        ratingAverage\\n        shop {{\\n          shopId: id\\n          name\\n          url\\n          city\\n          isOfficial\\n          isPowerBadge\\n          __typename\\n        }}\\n        url\\n        wishlist\\n        sourceEngine: source_engine\\n        __typename\\n      }}\\n      violation {{\\n        headerText\\n        descriptionText\\n        imageURL\\n        ctaURL\\n        ctaApplink\\n        buttonText\\n        buttonType\\n        
__typename\\n      }}\\n      __typename\\n    }}\\n    __typename\\n  }}\\n}}\\n"}}]'
    
    response = requests.post(url_target, headers=header, data=init_query)

    # Read the total number of matching products from the response header
    jumlah_data = response.json()[0]['data']['ace_search_product_v4']['header']['totalData']
    # 60 rows per page; the +1 lets range(1, jumlah_page) include the last page
    jumlah_page = math.ceil(jumlah_data/60) + 1
    
    return jumlah_data, jumlah_page

def scrape_tokeped(kata_kunci):
    print("Starting to scrape data from Tokopedia....")
    jml_data, jml_page = cek_jumlah_data(kata_kunci)
    hasil = []
    # page runs from 1 to the last page; data is the matching start offset (0, 60, 120, ...)
    for page, data in zip(range(1, jml_page), range(0, jml_data, 60)):
        print(page)
        query = f'[{{"operationName":"SearchProductQueryV4","variables":{{"params":"device=desktop&navsource=&ob=23&page={page}&q={kata_kunci}&related=true&rows=60&safe_search=false&scheme=https&shipping=&source=search&srp_component_id=01.07.00.00&srp_page_id=&srp_page_title=&st=product&start={data}&topads_bucket=true&unique_id=3220fd80a9a96a8eb398771a986004aa&user_addressId=&user_cityId=176&user_districtId=2274&user_id=&user_lat=&user_long=&user_postCode=&user_warehouseId=12210375&variants="}},"query":"query SearchProductQueryV4($params: String!) {{\\n  ace_search_product_v4(params: $params) {{\\n    header {{\\n      totalData\\n      totalDataText\\n      processTime\\n      responseCode\\n      errorMessage\\n      additionalParams\\n      keywordProcess\\n      componentId\\n      __typename\\n    }}\\n    data {{\\n      banner {{\\n        position\\n        text\\n        imageUrl\\n        url\\n        componentId\\n        trackingOption\\n        __typename\\n      }}\\n      backendFilters\\n      isQuerySafe\\n      ticker {{\\n        text\\n        query\\n        typeId\\n        componentId\\n        trackingOption\\n        __typename\\n      }}\\n      redirection {{\\n        redirectUrl\\n        departmentId\\n        __typename\\n      }}\\n      related {{\\n        position\\n        trackingOption\\n        relatedKeyword\\n        otherRelated {{\\n          keyword\\n          url\\n          product {{\\n            id\\n            name\\n            price\\n            imageUrl\\n            rating\\n            countReview\\n            url\\n            priceStr\\n            wishlist\\n            shop {{\\n              city\\n              isOfficial\\n              isPowerBadge\\n              __typename\\n            }}\\n            ads {{\\n              adsId: id\\n              productClickUrl\\n              productWishlistUrl\\n              shopClickUrl\\n              productViewUrl\\n              __typename\\n        
    }}\\n            badges {{\\n              title\\n              imageUrl\\n              show\\n              __typename\\n            }}\\n            ratingAverage\\n            labelGroups {{\\n              position\\n              type\\n              title\\n              url\\n              __typename\\n            }}\\n            componentId\\n            __typename\\n          }}\\n          componentId\\n          __typename\\n        }}\\n        __typename\\n      }}\\n      suggestion {{\\n        currentKeyword\\n        suggestion\\n        suggestionCount\\n        instead\\n        insteadCount\\n        query\\n        text\\n        componentId\\n        trackingOption\\n        __typename\\n      }}\\n      products {{\\n        id\\n        name\\n        ads {{\\n          adsId: id\\n          productClickUrl\\n          productWishlistUrl\\n          productViewUrl\\n          __typename\\n        }}\\n        badges {{\\n          title\\n          imageUrl\\n          show\\n          __typename\\n        }}\\n        category: departmentId\\n        categoryBreadcrumb\\n        categoryId\\n        categoryName\\n        countReview\\n        customVideoURL\\n        discountPercentage\\n        gaKey\\n        imageUrl\\n        labelGroups {{\\n          position\\n          title\\n          type\\n          url\\n          __typename\\n        }}\\n        originalPrice\\n        price\\n        priceRange\\n        rating\\n        ratingAverage\\n        shop {{\\n          shopId: id\\n          name\\n          url\\n          city\\n          isOfficial\\n          isPowerBadge\\n          __typename\\n        }}\\n        url\\n        wishlist\\n        sourceEngine: source_engine\\n        __typename\\n      }}\\n      violation {{\\n        headerText\\n        descriptionText\\n        imageURL\\n        ctaURL\\n        ctaApplink\\n        buttonText\\n        buttonType\\n        __typename\\n      }}\\n      
__typename\\n    }}\\n    __typename\\n  }}\\n}}\\n"}}]'
        response = requests.post(url_target, headers=header, data=query)
        products = response.json()[0]['data']['ace_search_product_v4']['data']['products']
        hasil.extend(products)
    
    # Turn the list of product dicts into a table and save it as CSV
    dtFrame = pd.DataFrame.from_dict(hasil)
    dtFrame.to_csv('data_tokped_2.csv', encoding='utf-8')
    print("Done ...")

keyword = "keybord RGB"
scrape_tokeped(keyword)
The script above searches with the keyword "keybord RGB" and then writes the results to a CSV file (data_tokped_2.csv).
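
The script builds each request body as an f-string with doubled braces, which is easy to break while editing. As an alternative sketch, the body can be assembled from Python objects and serialized with json.dumps, with the keyword URL-encoded by urllib.parse. Everything here is an assumption for illustration: the names page_plan and build_payload are hypothetical, EXAMPLE_QUERY is a shortened stand-in for the long query text copied from the browser, only a subset of the original params is included, and none of it has been tested against the live API.

```python
import json
import math
from urllib.parse import quote, urlencode

ROWS_PER_PAGE = 60  # same page size as the script above

# Shortened stand-in for the full query text copied from the browser (assumption).
EXAMPLE_QUERY = (
    "query SearchProductQueryV4($params: String!) {\n"
    "  ace_search_product_v4(params: $params) {\n"
    "    header { totalData }\n"
    "    data { products { id name price url } }\n"
    "  }\n"
    "}\n"
)


def page_plan(total_data, rows=ROWS_PER_PAGE):
    """Return (page, start) pairs that cover `total_data` results."""
    pages = math.ceil(total_data / rows)
    return [(page, (page - 1) * rows) for page in range(1, pages + 1)]


def build_payload(keyword, page, start, query=EXAMPLE_QUERY, rows=ROWS_PER_PAGE):
    """Build the JSON request body for one page of search results."""
    # quote_via=quote encodes spaces as %20, as in the copied request.
    params = urlencode(
        {
            "device": "desktop",
            "ob": 23,
            "page": page,
            "q": keyword,
            "rows": rows,
            "source": "search",
            "st": "product",
            "start": start,
        },
        quote_via=quote,
    )
    body = [
        {
            "operationName": "SearchProductQueryV4",
            "variables": {"params": params},
            "query": query,
        }
    ]
    return json.dumps(body)


if __name__ == "__main__":
    for page, start in page_plan(130):
        print(page, start, len(build_payload("keybord RGB", page, start)))
```

Each string returned by build_payload could be passed as the data argument of requests.post in place of the f-string, and page_plan yields the same page and start pairs as the zip over the two ranges in scrape_tokeped.
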

./X3NUX