{
    "componentChunkName": "component---src-templates-blog-post-js",
    "path": "/remove-duplicate-rows-from-a-pandas-dataframe/",
    "result": {"data":{"markdownRemark":{"html":"<p>Let’s read the <del>budget.xlsx</del> file into a DataFrame.</p>\n<pre class=\"grvsc-container synthwave-84\" data-language=\"py\" data-index=\"0\"><code class=\"grvsc-code\"><span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"1\"></span><span class=\"grvsc-source\"><span class=\"mtk10\">import</span><span class=\"mtk15\"> pandas </span><span class=\"mtk10\">as</span><span class=\"mtk15\"> pd</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"2\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"3\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">budget </span><span class=\"mtk12\">=</span><span class=\"mtk15\"> pd.</span><span class=\"mtk6\">read_excel</span><span class=\"mtk15\">(</span><span class=\"mtk16\">&quot;budget.xlsx&quot;</span><span class=\"mtk15\">)</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"4\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"5\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">budget</span></span></span></code></pre>\n<p><strong>Output:</strong></p>\n<p><span\n      class=\"gatsby-resp-image-wrapper\"\n      style=\"position: relative; display: block; margin-left: auto; margin-right: auto; max-width: 551px; \"\n    >\n      <a\n    class=\"gatsby-resp-image-link\"\n    
href=\"/static/bfb12460a2f2cf12db2059209ce7fbaf/d1cfa/budget.png\"\n    style=\"display: block\"\n    target=\"_blank\"\n    rel=\"noopener\"\n  >\n    <span\n    class=\"gatsby-resp-image-background-image\"\n    style=\"padding-bottom: 70.5%; position: relative; bottom: 0; left: 0; background-image: url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAOCAIAAACgpqunAAAACXBIWXMAAA7DAAAOwwHHb6hkAAACH0lEQVQoz0WS2XLjIBBF5/8/LzVxqjKxtYAaISGaRYhdTNmxk/t++nKa/lPcntXWSq61KoVa65xTa23fd++9NsY5h4hCCGusNUZsGyJKKX0If9ojpRTOOaX08nEhlAJMn5+ft9vt/e/7OI4A0+Xjcrl8AEzjI29vbwDsCZ/J73I9k4exM9vKp7G//XNaboweBluJKGZG+pZ8jUcNDoZOzeQFR2e0DjGOIzHGSim/ui7lrFAaY3IuYhMALKYUYgze9wMRMP40H86akjNMk9t3JWXX9yVno9Dttp0VUfJ5riWXnHKKhEw40ydcS9aCB6to9yVmYMO1v34uQGj3ZXDN3mwcaH+Nhw67ik5P3U3M06u51t0d9TyXdUGtCSFkmmLKiOiOo7VmrIV5PlvLpdbz5HxWSv9uWyGG4MkwoFKMsevtFmJEuWmtck5SymEcYorBHzGGcRzXdX01n/U4jlIKAOy7SykBY7VWa7RzrrWGiAxYPc/yuAhCqUT541yUUt8jpUTO53EcjuMQ66qNTikZo4FBjDGEEGPs+34T2wuuxVpbSlmWRRtDKL3PlpIBeO9ba0prYPN51vIIJQQRn3BKaV1FjJFSqrXRWnd9H2NSUj4ONgshGLCU0rN5GH6dc85K6VIKY8xYOzMgdDoOvwnx7ayUonQ6786PZkq37fXsku+/EkJclsVYu22Cc74KMTPm3H1/EpFz/u0cvJ8AlFL/ATVJFxDZRaEuAAAAAElFTkSuQmCC'); background-size: cover; display: block;\"\n  ></span>\n  <img\n        class=\"gatsby-resp-image-image\"\n        alt=\"Budget\"\n        title=\"Budget\"\n        src=\"/static/bfb12460a2f2cf12db2059209ce7fbaf/d1cfa/budget.png\"\n        srcset=\"/static/bfb12460a2f2cf12db2059209ce7fbaf/56d15/budget.png 200w,\n/static/bfb12460a2f2cf12db2059209ce7fbaf/d9f49/budget.png 400w,\n/static/bfb12460a2f2cf12db2059209ce7fbaf/d1cfa/budget.png 551w\"\n        sizes=\"(max-width: 551px) 100vw, 551px\"\n        style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\"\n        loading=\"lazy\"\n        decoding=\"async\"\n      />\n  </a>\n    </span></p>\n<p>We can see that we have duplicate rows in our DataFrame.</p>\n<p>We can remove them using the <del>drop_duplicates()</del> method.</p>\n<pre 
class=\"grvsc-container synthwave-84\" data-language=\"py\" data-index=\"1\"><code class=\"grvsc-code\"><span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"1\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">budget.</span><span class=\"mtk6\">drop_duplicates</span><span class=\"mtk15\">(</span><span class=\"mtk8 mtki\">inplace</span><span class=\"mtk12\">=</span><span class=\"mtk5\">True</span><span class=\"mtk15\">)</span></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"2\"></span><span class=\"grvsc-source\"></span></span>\n<span class=\"grvsc-line\"><span class=\"grvsc-gutter-pad\"></span><span class=\"grvsc-gutter grvsc-line-number\" aria-hidden=\"true\" data-content=\"3\"></span><span class=\"grvsc-source\"><span class=\"mtk15\">budget</span></span></span></code></pre>\n<p><strong>Output:</strong></p>\n<p><span\n      class=\"gatsby-resp-image-wrapper\"\n      style=\"position: relative; display: block; margin-left: auto; margin-right: auto; max-width: 551px; \"\n    >\n      <a\n    class=\"gatsby-resp-image-link\"\n    href=\"/static/2695ddec61c3d131f3a9bda61344f343/d1cfa/duplicatesRemoved.png\"\n    style=\"display: block\"\n    target=\"_blank\"\n    rel=\"noopener\"\n  >\n    <span\n    class=\"gatsby-resp-image-background-image\"\n    style=\"padding-bottom: 61%; position: relative; bottom: 0; left: 0; background-image: 
url('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAABQAAAAMCAIAAADtbgqsAAAACXBIWXMAAA7DAAAOwwHHb6hkAAABwklEQVQoz02S627cIBBG8/4v11Zptk23qg0Od7C5GBjAeFuvs9rO35lPR+fTvDTvqrd77601Y7R1rm3bvvcQQlyjtdb7YLRWSnnvnXNKa23MPBsAeLndBwA45wihy+UHIRQjdL3+/nW9Xt7epgkTQl6/v76//6SUYozHcfj29YvS+jNca00p7b1TSmNMhBBCWUp51rq1drvdtDFKq/N4a40Qsob1GfY+pJQQxtZZSgiePpRSlJKYUq1VSim4KKUAQE4JY+ycf4ZjjL13IcSyLCMaGRcxJaNUKWXf93mZpZT7vm/3YZSu64NcSrHWAWSEkPd+GMcRoVqKVnKN60FWklBaS4UDncdxtNZ+hrdtyzm3u8yxhcwF773bZYZSbvuutJJKPcgNYRxCeJKdczlnhLCzVghBCck5KylijLXWeZ6F4KczQB7+DM65/5zXtbUmpVysHYaBUKaU4ozVWu/Oi5Cy935wW5sO8sM552y0AcgfhIQ1zsaMCMHd+SQfbQt5klNKwzBY+yBDAed8a01wvoZACOFCFiiz1p9tm5kLcTof5GnyPjw/bFkWAGDsCHPBOGNcCM5Yyvls+0mOcZqmf4X9BYIOpqQRd0rIAAAAAElFTkSuQmCC'); background-size: cover; display: block;\"\n  ></span>\n  <img\n        class=\"gatsby-resp-image-image\"\n        alt=\"Duplicate Rows Removed\"\n        title=\"Duplicate Rows Removed\"\n        src=\"/static/2695ddec61c3d131f3a9bda61344f343/d1cfa/duplicatesRemoved.png\"\n        srcset=\"/static/2695ddec61c3d131f3a9bda61344f343/56d15/duplicatesRemoved.png 200w,\n/static/2695ddec61c3d131f3a9bda61344f343/d9f49/duplicatesRemoved.png 400w,\n/static/2695ddec61c3d131f3a9bda61344f343/d1cfa/duplicatesRemoved.png 551w\"\n        sizes=\"(max-width: 551px) 100vw, 551px\"\n        style=\"width:100%;height:100%;margin:0;vertical-align:middle;position:absolute;top:0;left:0;\"\n        loading=\"lazy\"\n        decoding=\"async\"\n      />\n  </a>\n    </span></p>\n<p>The duplicate rows have been removed.</p>\n<h6 id=\"learn-how-to-find-duplicate-rows-in-a-pandas-dataframe-in-my-blog-post-here\" style=\"position:relative;\"><a href=\"#learn-how-to-find-duplicate-rows-in-a-pandas-dataframe-in-my-blog-post-here\" aria-label=\"learn how to find duplicate rows in a pandas dataframe in my blog post here permalink\" class=\"anchor before\"><svg aria-hidden=\"true\" focusable=\"false\" height=\"16\" version=\"1.1\" viewBox=\"0 0 16 16\" width=\"16\"><path 
fill-rule=\"evenodd\" d=\"M4 9h1v1H4c-1.5 0-3-1.69-3-3.5S2.55 3 4 3h4c1.45 0 3 1.69 3 3.5 0 1.41-.91 2.72-2 3.25V8.59c.58-.45 1-1.27 1-2.09C10 5.22 8.98 4 8 4H4c-.98 0-2 1.22-2 2.5S3 9 4 9zm9-3h-1v1h1c1 0 2 1.22 2 2.5S13.98 12 13 12H9c-.98 0-2-1.22-2-2.5 0-.83.42-1.64 1-2.09V6.25c-1.09.53-2 1.84-2 3.25C6 11.31 7.55 13 9 13h4c1.45 0 3-1.69 3-3.5S14.5 6 13 6z\"></path></svg></a>Learn how to find duplicate rows in a pandas DataFrame in my blog post <a href=\"https://hemanta.io/find-duplicate-rows-in-a-pandas-dataframe/\">here</a>.</h6>\n<style class=\"grvsc-styles\">\n  .grvsc-container {\n    overflow: auto;\n    position: relative;\n    -webkit-overflow-scrolling: touch;\n    padding-top: 1rem;\n    padding-top: var(--grvsc-padding-top, var(--grvsc-padding-v, 1rem));\n    padding-bottom: 1rem;\n    padding-bottom: var(--grvsc-padding-bottom, var(--grvsc-padding-v, 1rem));\n    border-radius: 8px;\n    border-radius: var(--grvsc-border-radius, 8px);\n    font-feature-settings: normal;\n    line-height: 1.4;\n  }\n  \n  .grvsc-code {\n    display: table;\n  }\n  \n  .grvsc-line {\n    display: table-row;\n    box-sizing: border-box;\n    width: 100%;\n    position: relative;\n  }\n  \n  .grvsc-line > * {\n    position: relative;\n  }\n  \n  .grvsc-gutter-pad {\n    display: table-cell;\n    padding-left: 0.75rem;\n    padding-left: calc(var(--grvsc-padding-left, var(--grvsc-padding-h, 1.5rem)) / 2);\n  }\n  \n  .grvsc-gutter {\n    display: table-cell;\n    -webkit-user-select: none;\n    -moz-user-select: none;\n    user-select: none;\n  }\n  \n  .grvsc-gutter::before {\n    content: attr(data-content);\n  }\n  \n  .grvsc-source {\n    display: table-cell;\n    padding-left: 1.5rem;\n    padding-left: var(--grvsc-padding-left, var(--grvsc-padding-h, 1.5rem));\n    padding-right: 1.5rem;\n    padding-right: var(--grvsc-padding-right, var(--grvsc-padding-h, 1.5rem));\n  }\n  \n  .grvsc-source:empty::after {\n    content: ' ';\n    -webkit-user-select: none;\n    
-moz-user-select: none;\n    user-select: none;\n  }\n  \n  .grvsc-gutter + .grvsc-source {\n    padding-left: 0.75rem;\n    padding-left: calc(var(--grvsc-padding-left, var(--grvsc-padding-h, 1.5rem)) / 2);\n  }\n  \n  /* Line transformer styles */\n  \n  .grvsc-has-line-highlighting > .grvsc-code > .grvsc-line::before {\n    content: ' ';\n    position: absolute;\n    width: 100%;\n  }\n  \n  .grvsc-line-diff-add::before {\n    background-color: var(--grvsc-line-diff-add-background-color, rgba(0, 255, 60, 0.2));\n  }\n  \n  .grvsc-line-diff-del::before {\n    background-color: var(--grvsc-line-diff-del-background-color, rgba(255, 0, 20, 0.2));\n  }\n  \n  .grvsc-line-number {\n    padding: 0 2px;\n    text-align: right;\n    opacity: 0.7;\n  }\n  \n  .synthwave-84 { background-color: #262335; }\n  .synthwave-84 .mtki { font-style: italic; }\n  .synthwave-84 .mtk10 { color: #FEDE5D; }\n  .synthwave-84 .mtk15 { color: #FF7EDBFF; }\n  .synthwave-84 .mtk12 { color: #FFFFFFEE; }\n  .synthwave-84 .mtk6 { color: #36F9F6; }\n  .synthwave-84 .mtk16 { color: #FF8B39; }\n  .synthwave-84 .mtk8 { color: #72F1B8; }\n  .synthwave-84 .mtk5 { color: #F97E72; }\n  .synthwave-84 .grvsc-line-highlighted::before {\n    background-color: var(--grvsc-line-highlighted-background-color, rgba(255, 255, 255, 0.1));\n    box-shadow: inset var(--grvsc-line-highlighted-border-width, 4px) 0 0 0 var(--grvsc-line-highlighted-border-color, rgba(255, 255, 255, 0.5));\n  }\n</style>","frontmatter":{"title":"Remove Duplicate Rows in a Pandas DataFrame","date":"2021-08-08"}}},"pageContext":{"slug":"/remove-duplicate-rows-from-a-pandas-dataframe/","prev":{"fields":{"slug":"/pandas-isnull-and-notnull/"},"frontmatter":{"modules":null}},"next":{"fields":{"slug":"/convert-floating-point-numbers-to-integers-in-pandas/"},"frontmatter":{"modules":null}}}},
    "staticQueryHashes": ["3159585216"]}