Why does the `updatedb` program run so fast?













Usually when I have programs that are doing a full disk scan and going over all files in the system they take a very long time to run. Why does updatedb run so fast in comparison?










      Tags: performance, updatedb






      asked 23 hours ago by hugomg
      edited 3 hours ago by Jeff Schaller






















          2 Answers
          The answer depends on the version of locate you’re using, but there’s a fair chance it’s mlocate, whose updatedb runs quickly by avoiding full disk scans:




          mlocate is a locate/updatedb implementation. The 'm' stands for "merging":
          updatedb reuses the existing database to avoid rereading most of the file
          system, which makes updatedb faster and does not trash the system caches as
          much.




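          (The database stores each directory’s timestamp, ctime or mtime, whichever is newer. If that timestamp has not changed since the previous run, no entries have been added to, removed from or renamed in the directory, so its stored listing can be reused.)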
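
The merging idea described above can be sketched in a few lines of Python. This is only an illustration of the technique, not mlocate's actual code or database format: the `update` function, the in-memory dictionary standing in for the database, and the `reread` list are all assumptions made for the example, and error handling (unreadable directories, races with concurrent changes) is left out.

```python
import os


def dir_stamp(path):
    """Return the newer of a directory's ctime and mtime, in nanoseconds."""
    st = os.lstat(path)
    return max(st.st_ctime_ns, st.st_mtime_ns)


def update(root, old_db):
    """Rebuild the database, reusing listings of unchanged directories.

    Both old_db and the result map a directory path to a tuple
    (stamp, file_names, subdir_names). Returns (new_db, reread), where
    reread lists the directories that had to be listed again.
    """
    new_db, reread = {}, []
    pending = [root]
    while pending:
        path = pending.pop()
        stamp = dir_stamp(path)
        cached = old_db.get(path)
        if cached is not None and cached[0] == stamp:
            # Timestamp unchanged: reuse the stored listing, no directory read.
            _, files, subdirs = cached
        else:
            # New or changed directory: list it again.
            reread.append(path)
            files, subdirs = [], []
            with os.scandir(path) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        subdirs.append(entry.name)
                    else:
                        files.append(entry.name)
            files.sort()
            subdirs.sort()
        new_db[path] = (stamp, files, subdirs)
        # Subdirectories are always visited: a change inside one of them
        # updates only that subdirectory's own timestamp, not this one's.
        pending.extend(os.path.join(path, name) for name in subdirs)
    return new_db, reread
```

Calling `update(root, {})` lists every directory once; passing the returned database to the next call lists only directories whose timestamp has moved. Each directory still costs one `lstat`, which is far cheaper than reading all of its entries.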



          Like most updatedb implementations, mlocate’s also skips any file systems and paths it is configured to ignore. mlocate itself ignores nothing by default, but distributions typically ship a basic updatedb.conf that excludes networked file systems, virtual file systems and the like (see Debian’s configuration file for example; this is standard practice in Debian, so GNU’s updatedb is configured similarly).







          answered 23 hours ago by Stephen Kitt, edited 7 hours ago












          • Fairly good question and answer, did not even know there were "differential" scans.
            – Rui F Ribeiro
            23 hours ago








          • Thanks! I had never noticed that modifying a file also changes the ctime and mtime of all its parent directories.
            – hugomg
            22 hours ago






          • @hugomg I don't think it actually does. It should only change the mtime of its immediate parent.
            – Kusalananda
            22 hours ago










          • So if I understand it correctly, mlocate cares about ctime and mtime, which implies it cares only about whether the list of directory entries is still the same (no removed or added files), which means it doesn't care about the actual files themselves. Is that correct?
            – Sergiy Kolodyazhnyy
            14 hours ago










          • @Sergiy: Of course. locate isn't grep -R. It does not read file content.
            – Kevin
            14 hours ago
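
The point raised in the comments above, about which directory timestamps actually change, can be checked with a short experiment. The following is a minimal sketch assuming a POSIX-style file system; the `changed_dirs` helper and the directory names are made up for the example. It backdates three nested directories, performs an operation in the innermost one, and reports which directories received a new mtime.

```python
import os
import tempfile

OLD = 1_000_000_000  # arbitrary timestamp in the past (September 2001)


def changed_dirs(root):
    """Report which directories get a new mtime after two operations."""
    parent = os.path.join(root, "parent")
    child = os.path.join(parent, "child")
    os.makedirs(child)
    target = os.path.join(child, "file.txt")
    names = {"root": root, "parent": parent, "child": child}

    def reset():
        # Backdate every directory so any later update is easy to spot.
        for d in names.values():
            os.utime(d, (OLD, OLD))

    def touched():
        return sorted(n for n, d in names.items()
                      if int(os.stat(d).st_mtime) != OLD)

    reset()
    with open(target, "w") as f:  # creating a file adds a directory entry
        f.write("first\n")
    after_create = touched()

    reset()
    with open(target, "w") as f:  # rewriting it leaves the entry list alone
        f.write("second\n")
    after_rewrite = touched()

    return after_create, after_rewrite


with tempfile.TemporaryDirectory() as tmp:
    print(changed_dirs(tmp))
```

On Linux this prints `(['child'], [])`: creating the file updates only the immediate parent directory, and rewriting an existing file's contents updates no directory at all. That is why a locate database, which records names rather than contents, can rely on per-directory timestamps, and also why every subdirectory still has to be checked individually.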


















          In addition to checking modification times, mlocate also ignores certain subtrees of the file system that hold lots of uninteresting or potentially duplicate files, as specified in /etc/updatedb.conf (and described in man updatedb.conf). With Fedora's default configuration, these are:




          • Bind mounts

          • Some kinds of file systems (9p, afs, bdev, etc.)

          • VCS repository databases (.git, .hg, etc.)

          • Specific directories listed by path in the configuration (/media, /tmp, /var/spool/cups, etc.)
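
To make the exclusions listed above concrete, here is a small Python sketch. The variable names `PRUNE_BIND_MOUNTS`, `PRUNEFS`, `PRUNENAMES` and `PRUNEPATHS` are the ones documented in man updatedb.conf; the values are illustrative and limited to the examples named above, not a copy of any distribution's file. The `parse_conf` and `is_pruned` helpers are hypothetical and much simpler than the real parser. They cover only the path and name rules, because file system types and bind mounts require consulting the mount table.

```python
import posixpath

# Illustrative values only, limited to the examples named above.
SAMPLE_CONF = """
PRUNE_BIND_MOUNTS = "yes"
PRUNEFS = "9p afs bdev"
PRUNENAMES = ".git .hg"
PRUNEPATHS = "/media /tmp /var/spool/cups"
"""


def parse_conf(text):
    """Parse NAME = "space separated values" lines into a dict of lists."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip('"').split()
    return settings


def is_pruned(directory, settings):
    """Return True if a directory would be skipped by path or by name."""
    directory = posixpath.normpath(directory)
    if directory in settings.get("PRUNEPATHS", []):
        return True
    return posixpath.basename(directory) in settings.get("PRUNENAMES", [])


settings = parse_conf(SAMPLE_CONF)
for d in ("/tmp", "/home/user/project/.git", "/home/user/project"):
    print(d, is_pruned(d, settings))
```

A skipped directory is never entered, so everything beneath it is excluded as well. This is where the time saving comes from on trees such as large version control repositories.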







          answered 16 hours ago by hugomg












          • This isn’t the case by default though, so the base behaviour depends on the distribution being used. (Other updatedb implementations also support configured exclusions.)
            – Stephen Kitt
            7 hours ago










          • Indeed. I was describing the defaults for Fedora.
            – hugomg
            23 mins ago

















