{
    "componentChunkName": "component---src-templates-blog-post-js",
    "path": "/memory-optimization-in-pandas-dataframes/",
    "result": {"data":{"markdownRemark":{"html":"<p>Let’s read an Excel file into a DataFrame:</p>\n<pre class=\"grvsc-container synthwave-84\" data-language=\"py\" data-index=\"0\"><code class=\"grvsc-code\"><span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"1\"></span><span class=\"grvsc-source\"><span class=\"mtk10\">import</span><span class=\"mtk15\"> pandas </span><span class=\"mtk10\">as</span><span class=\"mtk15\"> pd</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"2\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"3\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv </span><span class=\"mtk12\">=</span><span class=\"mtk15\"> pd.</span><span class=\"mtk6\">read_excel</span><span class=\"mtk15\">(</span><span class=\"mtk16\">&quot;./NSV/PA/August/PA-AUGUST-19.xlsx&quot;</span><span class=\"mtk15\">, </span><span class=\"mtk8 mtki\">sheet_name</span><span class=\"mtk15\"> </span><span class=\"mtk12\">=</span><span class=\"mtk15\"> </span><span class=\"mtk16\">&quot;Data&quot;</span><span class=\"mtk15\">)</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"4\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"5\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv.</span><span class=\"mtk6\">info</span><span 
class=\"mtk15\">()</span></span></span></code></pre>\n<p><strong>Output:</strong></p>\n<p><span\n      class=\"gatsby-resp-image-wrapper\"\n      style=\"position: relative; display: block; margin-left: auto; margin-right: auto; max-width: 496px; \"\n    >\n      <a\n    class=\"gatsby-resp-image-link\"\n    href=\"/static/1bc20e43cb7bd4f28b0f0137568da953/60009/dataframe.png\"\n    style=\"display: block\"\n    target=\"_blank\"\n    rel=\"noopener\"\n  >\n    <span\n    class=\"gatsby-resp-image-background-image\"\n    style=\"padding-bottom: 103%; position: relative; bottom: 0; left: 0; background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAVCAIAAADJt1n/AAAACXBIWXMAAA7DAAAOwwHHb6hkAAACvElEQVQ4y22Ux5LbMBBE9/9/zRfXSqLEjEEGA0iCAUkruhRqa21rjiw8dk/PAB9KqSRJ6hphDLw8Z5+/GEZACKOMMt61HRfCOrfvtxjj9e/6WJa117ppVFYUIcYQ4mwWSjkAcMo4587ZRsnJmP2/+ogxOmdnY7TWy7zu+369XruuM8YsyzIMw7ZufdfP87zv++12+wv2IZjZVHXVtO04Tvu+B++PxwMhBGOCCR7HKU8zguGNcghhW1cEwGXTtI2zNsaYJAmljHGBMdZ6qIoCAE3GrOsS4vUHHOO2rWVZci6en0IIaZZyzrkQlNJhGFFVYwx93w+6t87/sO39ssyMsXleXrD3x9OBUEooBYBpnIo0o5Tsj6b/s72tlBAAvC7rU/lyuTDOlWoYZeM4QY04Z28CCyEsy0IwFGXtHpaC94fDoUYAGCOEtB6y8wUQehOY936aBoyJkM13z3mRS6W4kJRRMxlUVozRN7Bztm0bAJRl2dPV3XaW0kdghJBhGKu84O9h77q+JYDSPDNm9t6H4I+nIyaEEAoYBj2issKU/NPwy3avOwCQTWPMtG5bDCFNL5wLKSWldBxGSnGe5e7HkF6wtVYpVVXlat33qE7fygB9rwFVyflsHwd+yt9ta90TDOaxva8lSS9cCPboeRxGQiDLsnews23XVmVFGA8hvJSTE2VciKftiVGSpu/gzW5SSgDAjD4j8f4ZGAbAVV1pPZZpzgR/syTO2r7rGaNlUSilYoze+zzPhZRCSs7ZNBlU1c8Ne/Lfv7jD93tT1cfTaRzur8I8myQ54XtiFGM8jWN2TinGb+ZszARQS6Eo44Me2rZTqimLsm1bKWRdl4wLKSQlpO26dVt13/Va97qPMX4MrC4+f2lW1clvUV4USml6mBrx9XXz3q/raq27Xr+ctdu2xfv9vZdz7na7/QHSt5zOduZAOwAAAABJRU5ErkJggg=='); background-size: cover; display: block;\"\n  ></span>\n  <img\n        class=\"gatsby-resp-image-image\"\n        alt=\"DataFrame\"\n        title=\"DataFrame\"\n        src=\"/static/1bc20e43cb7bd4f28b0f0137568da953/60009/dataframe.png\"\n        
srcset=\"/static/1bc20e43cb7bd4f28b0f0137568da953/56d15/dataframe.png 200w,\n/static/1bc20e43cb7bd4f28b0f0137568da953/d9f49/dataframe.png 400w,\n/static/1bc20e43cb7bd4f28b0f0137568da953/60009/dataframe.png 496w\"\n        sizes=\"(max-width: 496px) 100vw, 496px\"\n        style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\"\n        loading=\"lazy\"\n        decoding=\"async\"\n      />\n  </a>\n    </span></p>\n<p>Our DataFrame has 18429 rows, and its memory usage is 3.4+ MB.</p>\n<p>We can reduce this memory usage.</p>\n<p>To do so, we find the columns that have only a few unique values and convert them to the categorical data type. A categorical column stores each distinct value once and represents every row with a small integer code, which usually takes less memory than repeating the full value in every row.</p>\n<p>The following columns have only a few unique values:</p>\n<ul>\n<li>ARM</li>\n<li>LINE</li>\n<li>SEASON</li>\n</ul>\n<p>We can count the unique values in a column using the <code>nunique()</code> method.</p>\n<h6 id=\"learn-how-to-count-unique-values-in-a-column-in-my-blog-post-here\" style=\"position:relative;\"><a href=\"#learn-how-to-count-unique-values-in-a-column-in-my-blog-post-here\" aria-label=\"learn how to count unique values in a column in my blog post here permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Learn how to count unique values in a column in my blog post <a href=\"https://hemanta.io/count-the-unique-values-in-a-column-in-a-pandas-dataframe/\">here</a>.</h6>\n<pre class=\"grvsc-container synthwave-84\" data-language=\"py\" data-index=\"1\"><code 
class=\"grvsc-code\"><span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"1\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv[</span><span class=\"mtk16\">&quot;ARM&quot;</span><span class=\"mtk15\">].</span><span class=\"mtk6\">nunique</span><span class=\"mtk15\">()</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"2\"></span><span class=\"grvsc-source\"><span class=\"mtk5\">7</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"3\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"4\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv[</span><span class=\"mtk16\">&quot;LINE&quot;</span><span class=\"mtk15\">].</span><span class=\"mtk6\">nunique</span><span class=\"mtk15\">()</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"5\"></span><span class=\"grvsc-source\"><span class=\"mtk5\">7</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"6\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"7\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv[</span><span class=\"mtk16\">&quot;SEASON&quot;</span><span class=\"mtk15\">].</span><span 
class=\"mtk6\">nunique</span><span class=\"mtk15\">()</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"8\"></span><span class=\"grvsc-source\"><span class=\"mtk5\">14</span></span></span></code></pre>\n<p>Let’s convert these columns to categorical data types.</p>\n<pre class=\"grvsc-container synthwave-84\" data-language=\"py\" data-index=\"2\"><code class=\"grvsc-code\"><span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"1\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv[</span><span class=\"mtk16\">&quot;ARM&quot;</span><span class=\"mtk15\">] </span><span class=\"mtk12\">=</span><span class=\"mtk15\"> nsv[</span><span class=\"mtk16\">&quot;ARM&quot;</span><span class=\"mtk15\">].</span><span class=\"mtk6\">astype</span><span class=\"mtk15\">(</span><span class=\"mtk16\">&quot;category&quot;</span><span class=\"mtk15\">)</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"2\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"3\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv[</span><span class=\"mtk16\">&quot;LINE&quot;</span><span class=\"mtk15\">] </span><span class=\"mtk12\">=</span><span class=\"mtk15\"> nsv[</span><span class=\"mtk16\">&quot;LINE&quot;</span><span class=\"mtk15\">].</span><span class=\"mtk6\">astype</span><span class=\"mtk15\">(</span><span class=\"mtk16\">&quot;category&quot;</span><span class=\"mtk15\">)</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter 
grvsc-line-number\" aria-hidden=\"true\" data-content=\"4\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"5\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv[</span><span class=\"mtk16\">&quot;SEASON&quot;</span><span class=\"mtk15\">] </span><span class=\"mtk12\">=</span><span class=\"mtk15\"> nsv[</span><span class=\"mtk16\">&quot;SEASON&quot;</span><span class=\"mtk15\">].</span><span class=\"mtk6\">astype</span><span class=\"mtk15\">(</span><span class=\"mtk16\">&quot;category&quot;</span><span class=\"mtk15\">)</span></span></span></code></pre>\n<p>Let’s check the memory usage now:</p>\n<pre class=\"grvsc-container synthwave-84\" data-language=\"py\" data-index=\"3\"><code class=\"grvsc-code\"><span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"1\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">nsv.</span><span class=\"mtk6\">info</span><span class=\"mtk15\">()</span></span></span></code></pre>\n<p><strong>Output:</strong></p>\n<p><span\n      class=\"gatsby-resp-image-wrapper\"\n      style=\"position: relative; display: block; margin-left: auto; margin-right: auto; max-width: 565px; \"\n    >\n      <a\n    class=\"gatsby-resp-image-link\"\n    href=\"/static/e3306bf33452f78ad1057003a173140a/a1277/memorySaved.png\"\n    style=\"display: block\"\n    target=\"_blank\"\n    rel=\"noopener\"\n  >\n    <span\n    class=\"gatsby-resp-image-background-image\"\n    style=\"padding-bottom: 90.49999999999999%; position: relative; bottom: 0; left: 0; background-image: 
url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAASCAIAAADUsmlHAAAACXBIWXMAAA7DAAAOwwHHb6hkAAACQUlEQVQ4y3WSS9OTMBiFv///c1y4c3SjG1dOlZYWEnIDciFAyJ22OPRTN7Znlcs8c05O3reWsculqmvA+x4AQCkllEIIWdtJKad5vl6vtxd6SzGGGIUQztlt21JK0zQLKadx8sEvZl6W/fyp3kIM4zRRSqQa7vf7mlNV15QxRpmUkhGCML7dbs/hlLIxc4Ox0to6e72uVV0zxrq2U8PQ933xq5hnc3/oOYwQcs7f77c1ZwBh13Xt/mbFKG1Qw3k/6mG9/QeHEMZRAwBvj7uUwuHXzxrUe4VCoKY5HA7O+23bnjon7x1CSEm1bdu6Ppz7Xgo5aK2UKsuzWZbncIjRLAsA0Fq3bVtO8XypMCYYEyElhhAA8D/2B3bO/mnFmD12DMeyhA2EEO6xISyKIqb0HI4pDFpjjGYz77FzbhDmgkuptNZ923EuXv6zD77ruqquuOCzMSmG06lECDUN6jlvGSnLMoYXzt578tD7Pqd4PJ2apkEIc84pxUVReBcehT2JHfU4SCnf79acMSVCCPWIrZTAGIdXztZZxtryfDHWvhf2sygAhKCGfd9Tgk9lmdL6ArZW9Lw8l4PWe+ycztWFEMJoq9Q+nlVV279D8v5n/xb7hE1aQ1BTgqdRO2satFcluNSj7ru2quqcXzhP1kEmatwxOQ7G68Udz+fj8XQ8Hgkh4HyBEE7TZBYTYxzH0XvvnJvnKYTwpg5fyLcP0+Ez/vpB/fjEv3+0HIWUvbPOh5TTuq5mWZxzOeXF2vDQsiwxxd/Sd/pdNnuh5QAAAABJRU5ErkJggg=='); background-size: cover; display: block;\"\n  ></span>\n  <img\n        class=\"gatsby-resp-image-image\"\n        alt=\"Memory Usage Reduction\"\n        title=\"Memory Usage Reduction\"\n        src=\"/static/e3306bf33452f78ad1057003a173140a/a1277/memorySaved.png\"\n        srcset=\"/static/e3306bf33452f78ad1057003a173140a/56d15/memorySaved.png 200w,\n/static/e3306bf33452f78ad1057003a173140a/d9f49/memorySaved.png 400w,\n/static/e3306bf33452f78ad1057003a173140a/a1277/memorySaved.png 565w\"\n        sizes=\"(max-width: 565px) 100vw, 565px\"\n        style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\"\n        loading=\"lazy\"\n        decoding=\"async\"\n      />\n  </a>\n    </span></p>\n<p>The new memory usage is 2.4MB, which is a reduction of approximately 30%.</p>\n<style class=\"grvsc-styles\">\n  .grvsc-container {\n    overflow: auto;\n    position: relative;\n    -webkit-overflow-scrolling: touch;\n    padding-top: 1rem;\n    padding-top: var(--grvsc-padding-top, var(--grvsc-padding-v, 1rem));\n    padding-bottom: 1rem;\n    
padding-bottom: var(--grvsc-padding-bottom, var(--grvsc-padding-v, 1rem));\n    border-radius: 8px;\n    border-radius: var(--grvsc-border-radius, 8px);\n    font-feature-settings: normal;\n    line-height: 1.4;\n  }\n  \n  .grvsc-code {\n    display: table;\n  }\n  \n  .grvsc-line {\n    display: table-row;\n    box-sizing: border-box;\n    width: 100%;\n    position: relative;\n  }\n  \n  .grvsc-line > * {\n    position: relative;\n  }\n  \n  .grvsc-gutter-pad {\n    display: table-cell;\n    padding-left: 0.75rem;\n    padding-left: calc(var(--grvsc-padding-left, var(--grvsc-padding-h, 1.5rem)) / 2);\n  }\n  \n  .grvsc-gutter {\n    display: table-cell;\n    -webkit-user-select: none;\n    -moz-user-select: none;\n    user-select: none;\n  }\n  \n  .grvsc-gutter::before {\n    content: attr(data-content);\n  }\n  \n  .grvsc-source {\n    display: table-cell;\n    padding-left: 1.5rem;\n    padding-left: var(--grvsc-padding-left, var(--grvsc-padding-h, 1.5rem));\n    padding-right: 1.5rem;\n    padding-right: var(--grvsc-padding-right, var(--grvsc-padding-h, 1.5rem));\n  }\n  \n  .grvsc-source:empty::after {\n    content: ' ';\n    -webkit-user-select: none;\n    -moz-user-select: none;\n    user-select: none;\n  }\n  \n  .grvsc-gutter + .grvsc-source {\n    padding-left: 0.75rem;\n    padding-left: calc(var(--grvsc-padding-left, var(--grvsc-padding-h, 1.5rem)) / 2);\n  }\n  \n  /* Line transformer styles */\n  \n  .grvsc-has-line-highlighting > .grvsc-code > .grvsc-line::before {\n    content: ' ';\n    position: absolute;\n    width: 100%;\n  }\n  \n  .grvsc-line-diff-add::before {\n    background-color: var(--grvsc-line-diff-add-background-color, rgba(0, 255, 60, 0.2));\n  }\n  \n  .grvsc-line-diff-del::before {\n    background-color: var(--grvsc-line-diff-del-background-color, rgba(255, 0, 20, 0.2));\n  }\n  \n  .grvsc-line-number {\n    padding: 0 2px;\n    text-align: right;\n    opacity: 0.7;\n  }\n  \n  .synthwave-84 { background-color: #262335; }\n  
.synthwave-84 .mtki { font-style: italic; }\n  .synthwave-84 .mtk10 { color: #FEDE5D; }\n  .synthwave-84 .mtk15 { color: #FF7EDBFF; }\n  .synthwave-84 .mtk12 { color: #FFFFFFEE; }\n  .synthwave-84 .mtk6 { color: #36F9F6; }\n  .synthwave-84 .mtk16 { color: #FF8B39; }\n  .synthwave-84 .mtk8 { color: #72F1B8; }\n  .synthwave-84 .mtk5 { color: #F97E72; }\n  .synthwave-84 .grvsc-line-highlighted::before {\n    background-color: var(--grvsc-line-highlighted-background-color, rgba(255, 255, 255, 0.1));\n    box-shadow: inset var(--grvsc-line-highlighted-border-width, 4px) 0 0 0 var(--grvsc-line-highlighted-border-color, rgba(255, 255, 255, 0.5));\n  }\n</style>","frontmatter":{"title":"Memory Optimization in Pandas DataFrames","date":"2021-08-09"}}},"pageContext":{"slug":"/memory-optimization-in-pandas-dataframes/","prev":{"fields":{"slug":"/creating-spreadsheet-style-pivot-tables-in-pandas/"},"frontmatter":{"modules":null}},"next":{"fields":{"slug":"/count-the-unique-values-in-a-column-in-a-pandas-dataframe/"},"frontmatter":{"modules":null}}}},
    "staticQueryHashes": ["3159585216"]}