{"id":763,"date":"2024-07-10T16:42:00","date_gmt":"2024-07-10T14:42:00","guid":{"rendered":"https:\/\/jankiewicz.pl\/?p=763"},"modified":"2025-04-13T15:17:51","modified_gmt":"2025-04-13T13:17:51","slug":"apache-spark-dla-kursu-data-science-pro-ai","status":"publish","type":"post","link":"https:\/\/jankiewicz.pl\/index.php\/apache-spark-dla-kursu-data-science-pro-ai\/","title":{"rendered":"Apache Spark for the Data Science PRO + AI course"},"content":{"rendered":"\n<h2 class=\"wp-block-heading\">Schedule<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"#day1\" data-type=\"internal\" data-id=\"#day1\">Day 1<\/a><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Big Data &#8211; introduction<\/li>\n\n\n\n<li>Spark &#8211; introduction<\/li>\n\n\n\n<li>Spark SQL &#8211; DataFrame API &#8211; part 1<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><a href=\"#day2\" data-type=\"internal\" data-id=\"#day2\">Day 2<\/a><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Spark SQL &#8211; DataFrame API &#8211; part 2<\/li>\n\n\n\n<li>Spark SQL &#8211; SQL API<\/li>\n\n\n\n<li>Spark ML<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Introduction<\/h2>\n\n\n<div class=\"wp-block-image\">\n<figure class=\"alignright size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"852\" height=\"1024\" src=\"https:\/\/jankiewicz.pl\/wp-content\/uploads\/2023\/11\/image-852x1024.png\" alt=\"\" class=\"wp-image-732\" style=\"width:205px;height:auto\" srcset=\"https:\/\/jankiewicz.pl\/wp-content\/uploads\/2023\/11\/image-852x1024.png 852w, https:\/\/jankiewicz.pl\/wp-content\/uploads\/2023\/11\/image-250x300.png 250w, https:\/\/jankiewicz.pl\/wp-content\/uploads\/2023\/11\/image-768x923.png 768w, https:\/\/jankiewicz.pl\/wp-content\/uploads\/2023\/11\/image.png 944w\" sizes=\"auto, (max-width: 852px) 100vw, 852px\" \/><\/figure>\n<\/div>\n\n\n<p>Apache Spark is often described as the de facto standard for Big Data processing. 
<br>Its popularity, its availability on practically every cloud platform and in <em>on-premise<\/em> environments, its APIs for Scala and Java as well as Python and R, and above all the breadth of functionality it provides fully justify this claim.<\/p>\n\n\n\n<p>Until recently, the Apache Spark documentation presented its API code examples in the order Scala, Java, Python, R.<br>Since version 3.5.0 the order has been different: Python, Scala, Java, R. Python's popularity has made its mark, and its importance to the Data Science world only reinforces that.<\/p>\n\n\n\n<p>Materials available as part of the <a href=\"https:\/\/datasciencepro.kodolamacz.pl\/\" data-type=\"link\" data-id=\"https:\/\/datasciencepro.kodolamacz.pl\/\" target=\"_blank\" rel=\"noreferrer noopener\">Data Science PRO + AI<\/a> course.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Main goals of the training<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Getting to know the Spark platform and its Python API<\/li>\n\n\n\n<li>Using Apache Spark in a variety of large-scale data analysis scenarios<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Its main advantages<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>A comprehensive introduction to the Spark platform &#8211; after completing the training you know Spark's capabilities and the scope of its functionality.<\/li>\n\n\n\n<li>Practical examples and practices for analyzing large volumes of data<\/li>\n\n\n\n<li>Practice before theory &#8211; you know not only how, but also why<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\">Prerequisites<\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Good knowledge of SQL, the relational data model, and data warehouses<\/li>\n\n\n\n<li>Basic 
knowledge of the Python programming language<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Course structure<\/h2>\n\n\n\n<p>The course is divided into topics. Each topic comprises introductory lecture material and a set of exercises\/tasks\/tutorials that lets you get to know the topic hands-on.<\/p>\n\n\n\n<p>The <strong>lecture<\/strong> material is illustrated with slides containing many examples.<\/p>\n\n\n\n<p>The <strong>practical<\/strong> material takes the form of tasks\/workshops\/tutorials to complete on your own.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">List of topics<\/h2>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"day1\">Day 1<\/h3>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li>Introduction\n<ul class=\"wp-block-list\">\n<li>Lecture material\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP01-l1_24-Co-to-jest-Big-Data.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Big Data &#8211; introduction<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP01-l1_24-Czym-jest-Hadoop.pdf\" data-type=\"link\" data-id=\"SP01-l1_24-Czym-jest-Hadoop.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Hadoop &#8211; introduction<\/a><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Workshop\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP01_w1_24-\u015arodowisko-GCP.pdf\" data-type=\"link\" data-id=\"SP01_w1_24-\u015arodowisko-GCP.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">GCP platform setup<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Spark &#8211; introduction\n<ul class=\"wp-block-list\">\n<li>Lecture material\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP02_l1_24-Spark-wprowadzenie.pdf\" 
target=\"_blank\" rel=\"noreferrer noopener\">Spark &#8211; introduction<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP02_l2_24-Spark-wprowadzenie-WordCount.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Spark Core &#8211; RDD API &#8211; a Hello World example<\/a><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Workshop\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP02_w1_24-Spark-wprowadzenie-tutorial.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Spark &#8211; introduction &#8211; tutorial<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Spark SQL &#8211; DataFrame API &#8211; part 1\n<ul class=\"wp-block-list\">\n<li>Lecture material\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP04_l1_23-DataFrames-API-SQL.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Spark SQL &#8211; DataFrame API<\/a><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Workshop\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP04_w1_24-Spark-DataFrames-API-SQL-zadania.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Spark SQL &#8211; DataFrame API &#8211; workshop<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP04_w1_24-Spark-DataFrames-API-SQL-zadania.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">Spark SQL &#8211; DataFrame API &#8211; notebook<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\" id=\"day2\">Day 2<\/h3>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li>Spark SQL &#8211; DataFrame API &#8211; part 2\n<ul class=\"wp-block-list\">\n<li>Workshop\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP04_w2_24-Spark-DataFrames-API-SQL-zadania.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Spark SQL &#8211; DataFrame API &#8211; 
workshop 2<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP04_w2_24-Spark-DataFrames-API-SQL-zadania.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">Spark SQL &#8211; DataFrame API &#8211; notebook 2<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Spark &#8211; ML\n<ul class=\"wp-block-list\">\n<li>Lecture material\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP05_l1_23-Spark-ML.html\" target=\"_blank\" rel=\"noreferrer noopener\">Spark ML<\/a><\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Workshop\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP05_w1_24-Spark-ML-zadania.pdf\" data-type=\"URL\" target=\"_blank\" rel=\"noreferrer noopener\">Spark ML &#8211; workshop<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP05_w1_24-Spark-ML-zadania-solns.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">Spark ML &#8211; notebook 1<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP05_w2_24-Spark-ML-zadania-solns.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">Spark ML &#8211; notebook 2<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP05_w3_24-Spark-ML-zadania-solns.ipynb\" target=\"_blank\" rel=\"noreferrer noopener\">Spark ML &#8211; notebook 3<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n\n\n\n<h3 class=\"wp-block-heading\">Appendix<\/h3>\n\n\n\n<ol class=\"wp-block-list\">\n<li>Spark SQL &#8211; SQL API\n<ul class=\"wp-block-list\">\n<li>Workshop\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP04-w3-24-Spark-SQL.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Spark SQL &#8211; SQL API &#8211; workshop<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n\n\n\n<li>Spark Web UI\n<ul class=\"wp-block-list\">\n<li>Lecture material\n<ul class=\"wp-block-list\">\n<li><a 
href=\"https:\/\/jankiewicz.pl\/szkolenia\/bigdata-sp\/SP09_l1_24-Spark-WebUI.pdf\" target=\"_blank\" rel=\"noreferrer noopener\">Spark Web UI<\/a><\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<\/li>\n<\/ol>\n","protected":false},"excerpt":{"rendered":"<p>Until recently, the Apache Spark documentation presented its API code examples in the order Scala, Java, Python, R.<br \/>\nSince version 3.5.0 the order has been different: Python, Scala, Java, R. Python's popularity has made its mark, and its importance to the Data Science world only reinforces that.<\/p>\n","protected":false},"author":2,"featured_media":807,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"inline_featured_image":false,"footnotes":""},"categories":[7,56],"tags":[44,32,45,17,18],"class_list":["post-763","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-big-data","category-szkolenia","tag-gcp","tag-hadoop","tag-python","tag-spark","tag-sql"],"_links":{"self":[{"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/posts\/763","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/comments?post=763"}],"version-history":[{"count":36,"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/posts\/763\/revisions"}],"predecessor-version":[{"id":1145,"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/posts\/763\/revisions\/1145"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/media\/807"
}],"wp:attachment":[{"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/media?parent=763"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/categories?post=763"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/jankiewicz.pl\/index.php\/wp-json\/wp\/v2\/tags?post=763"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}